diff --git a/paperless-ngx/read.me b/paperless-ngx/read.me
new file mode 100644
index 0000000..86ef321
--- /dev/null
+++ b/paperless-ngx/read.me
@@ -0,0 +1,271 @@
+Paperless runs on Linux only. The following procedure has been tested on
+a minimal installation of Debian/Buster, which was the current stable
+release at the time of writing. Windows is not and will never be
+supported.
+Paperless requires Python 3. At this time, versions 3.10 through 3.12 are tested.
+Newer versions may work, but some dependencies may not fully support them yet.
+Support for older Python versions may be dropped as they reach end of life, or as
+newer versions are released and their dependency support is confirmed.
+Install dependencies. Paperless requires the following packages.
+  - python3
+  - python3-pip
+  - python3-dev
+  - default-libmysqlclient-dev for MariaDB
+  - pkg-config for mysqlclient (python dependency)
+  - fonts-liberation for generating thumbnails for plain text files
+  - imagemagick >= 6 for PDF conversion
+  - gnupg for handling encrypted documents
+  - libpq-dev for PostgreSQL
+  - libmagic-dev for mime type detection
+  - mariadb-client for MariaDB compile time
+  - libzbar0 for barcode detection
+  - poppler-utils for barcode detection
+
+Use this list for your preferred package management:
+python3 python3-pip python3-dev imagemagick fonts-liberation gnupg libpq-dev default-libmysqlclient-dev pkg-config libmagic-dev libzbar0 poppler-utils
+These dependencies are required for OCRmyPDF, which is used for text
+recognition.
+
+  - unpaper
+  - ghostscript
+  - icc-profiles-free
+  - qpdf
+  - liblept5
+  - libxml2
+  - pngquant (suggested for certain PDF image optimizations)
+  - zlib1g
+  - tesseract-ocr >= 4.0.0 for OCR
+  - tesseract-ocr language packs (tesseract-ocr-eng, tesseract-ocr-deu, etc)
+
+Use this list for your preferred package management:
+
+unpaper ghostscript icc-profiles-free qpdf liblept5 libxml2 pngquant zlib1g tesseract-ocr
+
+On Raspberry Pi, these libraries are required as well:
+
+  - libatlas-base-dev
+  - libxslt1-dev
+  - mime-support
+
+You will also need these for installing some of the Python dependencies:
+
+  - build-essential
+  - python3-setuptools
+  - python3-wheel
+
+Use this list for your preferred package management:
+
+build-essential python3-setuptools python3-wheel
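As a rough sketch of how the package lists above might be handed to a package manager, the following assumes a Debian-based system with apt-get. The variable names are illustrative only, and the command is printed for review rather than executed.

```shell
# Package lists as given in the text above.
BASE_PKGS="python3 python3-pip python3-dev imagemagick fonts-liberation gnupg libpq-dev default-libmysqlclient-dev pkg-config libmagic-dev libzbar0 poppler-utils"
OCR_PKGS="unpaper ghostscript icc-profiles-free qpdf liblept5 libxml2 pngquant zlib1g tesseract-ocr"
BUILD_PKGS="build-essential python3-setuptools python3-wheel"

# Assumption: apt-get is the package manager (Debian and derivatives).
INSTALL_CMD="sudo apt-get install -y $BASE_PKGS $OCR_PKGS $BUILD_PKGS"

# Print the command so it can be reviewed before it is run by hand.
echo "$INSTALL_CMD"
```

Tesseract language packs such as tesseract-ocr-eng and the Raspberry Pi libraries are left out here and would be appended as needed.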
+
+Install Redis >= 6.0 and configure it to start automatically.
+
+Optional: Install PostgreSQL and configure a database, user, and
+password for paperless. If you do not wish to use PostgreSQL,
+MariaDB and SQLite are available as well.
+Note
+On bare-metal installations using SQLite, ensure the JSON1
+extension is enabled. This is usually the case, but not always.
+Create a system user, with a new home folder, under which you wish
+to run paperless.
+
+Get the release archive from
+https://github.com/paperless-ngx/paperless-ngx/releases, for example with
+curl -O -L https://github.com/paperless-ngx/paperless-ngx/releases/download/v1.10.2/paperless-ngx-v1.10.2.tar.xz
+Extract the archive and copy the contents to the
+home folder of the user you created before (/opt/paperless).
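The extract-and-copy step might look like the following minimal sketch. The archive name comes from the curl example above; the name of the unpacked directory (paperless-ngx) is an assumption and should be checked against what tar actually produces.

```shell
# File name taken from the curl example above.
ARCHIVE="paperless-ngx-v1.10.2.tar.xz"
# Assumption: the archive unpacks into a directory named "paperless-ngx".
EXTRACTED_DIR="paperless-ngx"
TARGET_DIR="/opt/paperless"

if [ -f "$ARCHIVE" ]; then
    tar -xf "$ARCHIVE"
    # Copy everything, including dotfiles, into the paperless home folder.
    sudo cp -a "$EXTRACTED_DIR"/. "$TARGET_DIR"/
else
    echo "archive $ARCHIVE not found; download it first" >&2
fi
```

Files copied with sudo end up owned by root, so the ownership check described later in this procedure still applies.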
+Optional: If you cloned the git repo, you will have to
+compile the frontend yourself (see the frontend build instructions)
+and use the build step, not serve.
+Configure paperless. See configuration for details.
+Edit the included paperless.conf and adjust the settings to your
+needs. Required settings for getting paperless running are:
+
+  - PAPERLESS_REDIS should point to your Redis server.
+  - PAPERLESS_DBENGINE is optional, and should be one of postgres,
+    mariadb, or sqlite.
+  - PAPERLESS_DBHOST should be the hostname on which your
+    PostgreSQL server is running. Leave this unset to use
+    SQLite instead. Also configure port, database name, user, and
+    password as necessary.
+  - PAPERLESS_CONSUMPTION_DIR should point to a folder which
+    paperless should watch for documents. You might want to have
+    this somewhere else. Likewise, PAPERLESS_DATA_DIR and
+    PAPERLESS_MEDIA_ROOT define where paperless stores its data.
+    If you like, you can point both to the same directory.
+  - PAPERLESS_SECRET_KEY should be a random sequence of
+    characters. It is used for authentication. Failing to set it
+    allows third parties to forge authentication credentials.
+  - PAPERLESS_URL if you are behind a reverse proxy. This should
+    point to your domain. Please see configuration for more
+    information.
+
+Many more adjustments can be made to paperless, especially the OCR
+part. The following options are recommended for everyone:
+
+  - PAPERLESS_OCR_LANGUAGE to the language most of your
+    documents are written in.
+  - PAPERLESS_TIME_ZONE to your local time zone.
+
+Warning
+Ensure your Redis instance is secured.
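To make the settings above more concrete, here is a small sketch that generates a random secret key and prints a paperless.conf fragment. Every value is an example: the Redis URL, the database host, the OCR language code and the time zone are assumptions, and the directories simply follow the /opt/paperless layout used in this procedure.

```shell
# Generate 32 random bytes and render them as 64 hexadecimal characters.
SECRET_KEY="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"

# Print a paperless.conf fragment; every value below is an example.
cat <<EOF
PAPERLESS_REDIS=redis://localhost:6379
PAPERLESS_DBHOST=localhost
PAPERLESS_CONSUMPTION_DIR=/opt/paperless/consume
PAPERLESS_DATA_DIR=/opt/paperless/data
PAPERLESS_MEDIA_ROOT=/opt/paperless/media
PAPERLESS_SECRET_KEY=$SECRET_KEY
PAPERLESS_OCR_LANGUAGE=eng
PAPERLESS_TIME_ZONE=Europe/Berlin
EOF
```

The printed lines are meant to be reviewed and merged into paperless.conf by hand. For an SQLite setup, the PAPERLESS_DBHOST line would be left out.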
+Create the following directories if they are missing:
+  - /opt/paperless/media
+  - /opt/paperless/data
+  - /opt/paperless/consume
+
+Adjust as necessary if you configured different folders.
+Ensure that the paperless user has write permissions for every one
+of these folders. If needed, change their owner.
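A minimal sketch of the directory step, assuming the default folder names from the list above. The helper name make_paperless_dirs is hypothetical. It creates the three folders under a base directory and reports any that the current user cannot write to.

```shell
# Create the three working directories under a base directory and
# report any that the current user cannot write to.
make_paperless_dirs() {
    base="$1"
    for name in media data consume; do
        mkdir -p "$base/$name" || return 1
        if [ ! -w "$base/$name" ]; then
            echo "not writable: $base/$name" >&2
        fi
    done
}
```

It would be called as make_paperless_dirs /opt/paperless with sufficient privileges. Ownership can then be inspected with ls -l -d /opt/paperless/media and, if needed, changed with sudo chown paperless:paperless /opt/paperless/media, where the group name paperless is an assumption.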
+
+Install the Python requirements from the requirements.txt file.
+This will install all Python dependencies in the home directory of
+the new paperless user.
+Tip
+It is up to you whether to use a virtual environment for the Python
+dependencies. This is an alternative to the above and may require adjusting
+the example scripts to use the virtual environment paths.
+Tip
+If you use modern Python tooling, such as uv, installation will not include
+dependencies for PostgreSQL or MariaDB. You can select those extras with --extra <EXTRA>,
+or all of them with --all-extras.
Go to /opt/paperless/src, and execute the following command:
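The command itself is not reproduced in this text. As a hedged sketch, the following assumes it is Django's standard schema migration run through manage.py as the paperless user. The run helper is hypothetical and only prints the command unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Assumption: the install lives in /opt/paperless and runs as user "paperless".
PAPERLESS_SRC="/opt/paperless/src"

# Assumed command: Django's standard schema migration.
run sudo -Hu paperless sh -c "cd '$PAPERLESS_SRC' && python3 manage.py migrate"
```

Under the same assumptions, the optional test run described next would use python3 manage.py runserver in place of migrate.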
When you first access the web interface you will be prompted to create a superuser account.
+Optional: Test that paperless is working by starting the development
+server and pointing your browser to http://localhost:8000 if
+accessing from the same device on which paperless is installed.
+If accessing from another machine, set up systemd services. You may need
+to set PAPERLESS_DEBUG=true in order for the development server to work
+normally in your browser.
+Warning
+This is a development server which should not be used in production.
+It is not audited for security, and its performance is inferior to
+production-ready web servers.
+Tip
+This will not start the consumer. Paperless does this in a separate
+process.
+Set up systemd services to run paperless automatically. You may use
+the service definition files included in the scripts folder as a
+starting point.
+
+Paperless needs the webserver script to run the webserver, the
+consumer script to watch the input folder, the taskqueue script for the
+background workers used to handle things like document consumption,
+and the scheduler script to run tasks such as email checking at
+certain times.
+Note
+The socket script enables granian to run on port 80 without
+root privileges. For this you need to uncomment the
+Requires=paperless-webserver.socket line in the webserver script
+and configure granian to listen on port 80 (set GRANIAN_PORT).
+These services rely on Redis and, optionally, the database server, but
+don't need to be started in any particular order. The example files
+depend on Redis being started. If you use a database server, you
+should add additional dependencies.
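One way to add such a dependency without editing the example files is a systemd drop-in. The sketch below is an assumption-laden illustration: the helper name write_db_dropin, the file name database.conf and the unit name postgresql.service are not taken from this guide.

```shell
# Write a systemd drop-in that makes a unit wait for the database service.
# $1: drop-in directory, $2: database unit name.
write_db_dropin() {
    dropin_dir="$1"
    db_unit="$2"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/database.conf" <<EOF
[Unit]
After=$db_unit
Requires=$db_unit
EOF
}
```

Run as root, a call such as write_db_dropin /etc/systemd/system/paperless-webserver.service.d postgresql.service followed by systemctl daemon-reload would apply it to the webserver unit. Using Wants= instead of Requires= would be the softer choice if paperless should still start when the database unit fails.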
+Note
+For instructions on using a reverse proxy, see the wiki.
+Warning
+If celery won't start (check with
+sudo systemctl status paperless-task-queue.service and
+sudo systemctl status paperless-scheduler.service),
+you need to change the path in the service files. Example:
+ExecStart=/opt/paperless/.local/bin/celery --app paperless worker --loglevel INFO
+Optional: Install a Samba server and make the consumption folder
+available as a network share.
+Configure ImageMagick to allow processing of PDF documents. Most
+distributions have this disabled by default, since PDF documents can
+contain malware. If you don't do this, paperless will fall back to
+Ghostscript for certain steps such as thumbnail generation.
+
+Edit /etc/ImageMagick-6/policy.xml and adjust the policy entry for
+PDF so that ImageMagick is allowed to process PDF documents.
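The exact policy lines are not reproduced in this text. As a sketch, the following assumes the stock entry reads rights="none" pattern="PDF" and rewrites it to rights="read|write". The helper name allow_pdf_policy is hypothetical, and the file should be inspected first to confirm the entry actually looks like that.

```shell
# Allow ImageMagick to read and write PDFs by editing the given policy file.
# A backup of the original is kept next to it with a .bak suffix.
allow_pdf_policy() {
    policy_file="$1"
    sed -i.bak 's#rights="none" pattern="PDF"#rights="read|write" pattern="PDF"#' "$policy_file"
}
```

Run as root, allow_pdf_policy /etc/ImageMagick-6/policy.xml would apply the change in place. If the pattern does not match, the file is left unchanged apart from the backup copy.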
+
+Optional: Install the jbig2enc encoder. This will reduce the size of
+generated PDF documents. You'll most likely need to compile this
+yourself, because this software was patented until around 2017 and
+binary packages are not available for most distributions.
+Optional: If using the NLTK machine learning processing (see
+PAPERLESS_ENABLE_NLTK for details), download the NLTK data for the
+Snowball Stemmer, Stopwords, and Punkt tokenizer to
+/usr/share/nltk_data. Refer to the NLTK instructions for details on
+how to download the data.