Update README.md

master
Christopher Tensmeyer GitHub 8 years ago
parent
commit
a5506d45f6
1 file changed with 98 additions and 1 deletion
README.md

@@ -1 +1,98 @@
# pagenet
# PageNet

PageNet is a Deep Learning system that takes an image containing a document and returns a quadrilateral representing the main page region. We trained PageNet using the [Caffe](http://caffe.berkeleyvision.org) library. For details, see our [paper](https://arxiv.org/abs/1709.01618).

## Usage

There are three scripts in this repo: one for training networks, one for making predictions with pretrained networks, and one for rendering quadrilateral regions.

### Testing Pretrained Models

We have provided two pretrained models from our paper. One model is trained on the CBAD dataset and the other is trained on a private collection of Ohio Death Records provided by [Family Search](https://www.familysearch.org/).

`test_pretrained.py` has the following usage:

```
usage: test_pretrained.py [-h] [--out-dir OUT_DIR] [--gpu GPU]
                          [--print-count PRINT_COUNT]
                          image_dir manifest model out_file

Outputs binary predictions

positional arguments:
  image_dir             The directory where images are stored
  manifest              txt file listing images relative to image_dir
  model                 [cbad|ohio]
  out_file              Output file

optional arguments:
  -h, --help            show this help message and exit
  --out-dir OUT_DIR
  --gpu GPU             GPU to use for running the network
  --print-count PRINT_COUNT
                        Print interval
```
`image_dir` is the directory containing images to predict. The file paths listed in `manifest` are relative to `image_dir` and are listed one per line. `model` should be either `cbad` or `ohio` to select which trained model to use. `out_file` will list the coordinates of the quadrilaterals predicted by PageNet for each of the input images.

`--gpu` takes the device ID of the GPU to use. If it is negative, CPU mode is used. Specifying `--out-dir` dumps both the raw and post-processed predictions as images into that directory.
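
Since `manifest` is just a text file with one image path per line, relative to `image_dir`, it can be generated with a few lines of Python. The following is a minimal sketch and is not part of this repo: the function names `list_images` and `write_manifest` and the extension set are assumptions chosen for illustration.

```python
import os

# Hypothetical set of extensions; adjust it to the formats your images use.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def list_images(image_dir):
    """Return sorted image paths relative to image_dir, using forward slashes."""
    paths = []
    for root, _dirs, files in os.walk(image_dir):
        for name in files:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                rel = os.path.relpath(os.path.join(root, name), image_dir)
                paths.append(rel.replace(os.sep, "/"))
    return sorted(paths)


def write_manifest(image_dir, manifest_path):
    """Write one relative image path per line and return the number written."""
    paths = list_images(image_dir)
    with open(manifest_path, "w") as f:
        for p in paths:
            f.write(p + "\n")
    return len(paths)
```

Calling `write_manifest("images", "manifest.txt")` would produce a file that could then be passed as the `manifest` argument.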


### Training

`train.py` has the following usage:

```
usage: train.py [-h] [--gpu GPU] [-m MEAN] [-s SCALE] [-b BATCH_SIZE] [-c]
                [--image-size IMAGE_SIZE] [--gt-interval GT_INTERVAL]
                [--min-interval MIN_INTERVAL] [--debug-dir DEBUG_DIR]
                [--print-count PRINT_COUNT]
                solver_file dataset_dir train_manifest val_manifest

Outputs binary predictions

positional arguments:
  solver_file           The solver.prototxt
  dataset_dir           The dataset to be evaluated
  train_manifest        txt file listing images to train on
  val_manifest          txt file listing images for validation

optional arguments:
  -h, --help            show this help message and exit
  --gpu GPU             GPU to use for running the network
  -m MEAN, --mean MEAN  Mean value for data preprocessing
  -s SCALE, --scale SCALE
                        Optional pixel scale factor
  -b BATCH_SIZE, --batch-size BATCH_SIZE
                        Training batch size
  -c, --color           Training batch size
  --image-size IMAGE_SIZE
                        Size of images for input to training/prediction
  --gt-interval GT_INTERVAL
                        Interval for Debug
  --min-interval MIN_INTERVAL
                        Miniumum iteration for Debug
  --debug-dir DEBUG_DIR
                        Dump images for debugging
  --print-count PRINT_COUNT
                        How often to print progress
```
`solver_file` points to a Caffe solver.prototxt file. Such a file is included in the repo. The training script expects the network used for training to begin and end like the included `train_val.prototxt` file, but the middle layers can be changed.
`dataset_dir` is the directory containing the training and validation images. The file paths listed in `train_manifest` and `val_manifest` are relative to `dataset_dir` and are listed one per line.
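
If you start from a single list of image paths, one way to produce `train_manifest` and `val_manifest` is a seeded random split. This is only a sketch under assumptions: the helper names, the default `val_fraction` of 0.1, and the seed are illustrative and are not prescribed by this repo or the paper.

```python
import random


def split_manifest(paths, val_fraction=0.1, seed=0):
    """Shuffle paths deterministically and split them into (train, val) lists."""
    if not 0.0 <= val_fraction <= 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def write_lines(lines, out_path):
    """Write one entry per line, matching the manifest layout described above."""
    with open(out_path, "w") as f:
        for line in lines:
            f.write(line + "\n")
```

The two returned lists can be written with `write_lines(train, "train.txt")` and `write_lines(val, "val.txt")`; fixing the seed keeps the split reproducible across runs.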

`--gpu` is for passing the device ID of the GPU to use. If it is negative, CPU mode is used.

The optional arguments have reasonable defaults. If you're curious about their exact meaning, I suggest you look at the code.

### Rendering Masks

The usage for `render_quads.py` is:
```
python render_quads.py manifest dataset_dir out_dir
```

`manifest` lists the image file paths and quadrilateral coordinates. It should be the `out_file` of `test_pretrained.py`. The file paths in `manifest` are relative to `dataset_dir`. `out_dir` is an output directory where the quadrilateral region images are written.
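
To make the idea of rendering a quadrilateral region concrete, here is a small pure-Python sketch that turns four corner points into a binary mask. It is not the code in `render_quads.py`, which may well work differently (for instance with OpenCV drawing functions). The line layout `path,x1,y1,x2,y2,x3,y3,x4,y4` is an assumption made for illustration; check an actual `out_file` for the real format.

```python
def parse_quad_line(line):
    """Parse 'path,x1,y1,...,x4,y4' (an assumed layout) into (path, points)."""
    fields = line.strip().split(",")
    if len(fields) != 9:
        raise ValueError("expected a path followed by 8 coordinates")
    values = [float(v) for v in fields[1:]]
    points = list(zip(values[0::2], values[1::2]))
    return fields[0], points


def point_in_polygon(x, y, points):
    """Even-odd (ray casting) test of a point against a simple polygon."""
    inside = False
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def render_mask(points, width, height):
    """Return a height x width grid of 0/1 values, sampling each pixel centre."""
    return [[1 if point_in_polygon(c + 0.5, r + 0.5, points) else 0
             for c in range(width)]
            for r in range(height)]
```

For full-size images this per-pixel loop is slow; it is meant to show the geometry, not to replace the script.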


## Dependencies

The Python scripts depend on OpenCV 3.2, Matplotlib,
