Add ignore file pattern and enhance README

Updated the `.code-tokenizer-md-ignore` to include its own pattern for exclusion. Enhanced the README with a Quickstart section, detailed usage guidance, bundling process, and ignore file configuration.
2024-11-24 11:17:28 -05:00
parent b10b9085b2
commit 97471cfe19
2 changed files with 78 additions and 13 deletions


@@ -1,3 +1,4 @@
# This is just for testing to make sure the glob patterns work
# Check is valid when test-for-ignore.css is not included in the program output
**/*.css
**/.code-tokenizer-md-ignore


@@ -1,12 +1,17 @@
# code-tokenizer-md
-> Created to push creative limits.
-Process git repository files into markdown with token counting and sensitive data redaction.
+> Created to push creative limits. Processes git repository files into markdown with token counting and sensitive data redaction.
+## Quickstart
```
$ cd your-git-repo
$ npx code-tokenizer-md
```
#### Next Steps: Refine your outputs with [.code-tokenizer-md-ignore](#ignore-file-configuration)
## Overview
-`code-tokenizer-md` is a TypeScript/Bun tool that processes git repository files, cleans code, redacts sensitive information, and generates markdown documentation with token counts.
+`code-tokenizer-md` is a tool that processes git repository files, cleans code, redacts sensitive information, and generates markdown documentation with token counts.
```mermaid
graph TD
@@ -54,7 +59,9 @@ graph TD
## Installation
```shell
npm install code-tokenizer-md
```
## Usage
@@ -64,13 +71,6 @@ graph TD
npx code-tokenizer-md
```
### Library
###
```shell
npm install code-tokenizer-md
```
### Programmatic Usage
```typescript
@@ -84,6 +84,70 @@ const generator = new MarkdownGenerator({
const result = await generator.createMarkdownDocument();
```
## Ignore File Configuration
### .code-tokenizer-md-ignore
The `.code-tokenizer-md-ignore` file allows you to specify patterns for files and directories that should be excluded from processing. You can create this file in any directory within your project, and it will affect that directory and all subdirectories.
#### Features:
- Supports nested ignore files (multiple .code-tokenizer-md-ignore files in different directories)
- Uses glob patterns for matching
- Inherits patterns from parent directories
- Supports both relative and absolute paths
Example `.code-tokenizer-md-ignore` file:
```
# Ignore specific files
secrets.json
config.private.ts
# Ignore directories
build/
temp/
# Glob patterns
**/*.test.ts
**/._*
```
#### Pattern Rules:
- Lines starting with `#` are comments
- Empty lines are ignored
- Patterns are relative to the ignore file's location
- Use `**` for matching across directories
- Patterns without leading `/` or `**` are relative to the ignore file's directory
- Patterns with leading `/` are relative to the project root
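
The pattern rules above can be sketched as a small parser. This is a minimal illustration only, not the tool's actual implementation: the `IgnoreRule` type, the `parseIgnoreFile` function, and the example paths are all hypothetical names introduced here, and glob matching itself is left out.

```typescript
// Hypothetical sketch; names and behavior are assumptions, not the tool's actual code.
interface IgnoreRule {
  pattern: string; // pattern text, with a leading "/" stripped
  baseDir: string; // directory the pattern is resolved against
}

function parseIgnoreFile(
  content: string,
  ignoreFileDir: string,
  projectRoot: string,
): IgnoreRule[] {
  const rules: IgnoreRule[] = [];
  for (const rawLine of content.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Comments and empty lines carry no pattern.
    if (line === "" || line.startsWith("#")) continue;
    if (line.startsWith("/")) {
      // A leading "/" anchors the pattern at the project root.
      rules.push({ pattern: line.slice(1), baseDir: projectRoot });
    } else {
      // Everything else is relative to the ignore file's own directory.
      rules.push({ pattern: line, baseDir: ignoreFileDir });
    }
  }
  return rules;
}

// Example: an ignore file located in /repo/src of a project rooted at /repo.
const sample = ["# Ignore specific files", "secrets.json", "", "/build/", "**/*.test.ts"].join("\n");
console.log(parseIgnoreFile(sample, "/repo/src", "/repo"));
```

Each resulting rule pairs a pattern with the directory it should be matched from, which is one way nested ignore files could be combined: collect the rules from every ignore file between the project root and the file being checked.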
## Bundling Process
The project uses Bun's built-in bundler for creating optimized production builds. The bundling process includes:
1. **Source Compilation**:
- TypeScript files are compiled using Bun's native TypeScript support
- Declaration files are generated using `bun-plugin-isolated-decl`
- Output is optimized for Node.js runtime
2. **CLI Bundling**:
- Separate bundle for CLI usage
- Compiled to native binary for improved performance
- Includes shebang for direct execution
3. **Output Structure**:
```
dist/
├── index.js # Main library bundle
├── index.d.ts # TypeScript declarations
└── code-tokenizer-md # CLI executable
```
4. **Bundle Configuration**:
- Target: Node.js
- Module Format: ESM
- Includes source maps
- Preserves path resolution
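
As a rough illustration, the bundle configuration listed above could be written down as a plain settings object. This is a sketch under stated assumptions: the `BundleConfig` type, the `libraryBundle` name, the entry path, and the `"external"` source map mode are not taken from the project, and the "preserves path resolution" item is not modeled here.

```typescript
// Hypothetical build settings mirroring the list above; paths and names are assumptions.
interface BundleConfig {
  entrypoints: string[];
  outdir: string;
  target: "node" | "bun" | "browser";
  format: "esm";
  sourcemap: "none" | "inline" | "external" | "linked";
}

const libraryBundle: BundleConfig = {
  entrypoints: ["./src/index.ts"], // assumed entry file
  outdir: "./dist",
  target: "node", // Target: Node.js
  format: "esm", // Module Format: ESM
  sourcemap: "external", // assumption: the README only says source maps are included
};

// In a Bun build script, an object of this shape could be passed to Bun.build():
//   await Bun.build(libraryBundle);
console.log(libraryBundle);
```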
## Project Structure
```
@@ -122,7 +186,7 @@ src/
## Development
-This project uses [bun](https://github.com/oven-sh/bun) for it's toolchain. You should be able to use whatever you want as a consumer of the library.
+This project uses [bun](https://github.com/oven-sh/bun) for its toolchain. You should be able to use whatever you want as a consumer of the library.
### Building
```shell