Add paperless-ngx/read.me
271
paperless-ngx/read.me
Normal file
@@ -0,0 +1,271 @@
|
|||||||
|
<h3 id="bare_metal">Bare Metal Route</h3>
|
||||||
|
<p>Paperless runs on Linux only. The following procedure has been tested on
|
||||||
|
a minimal installation of Debian/Buster, which is the current stable
|
||||||
|
release at the time of writing. Windows is not and will never be
|
||||||
|
supported.</p>
|
||||||
|
<p>Paperless requires Python 3. At this time, 3.10 - 3.12 are tested versions.
|
||||||
|
Newer versions may work, but some dependencies may not fully support newer versions.
|
||||||
|
Support for older Python versions may be dropped as they reach end of life or as newer versions
|
||||||
|
are released, dependency support is confirmed, etc.</p>
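<p>The version requirement above can be checked before going further. This is a minimal sketch, not part of paperless itself: the helper name <code>python_supported</code> is hypothetical, and the range 3.10 to 3.12 is taken from the paragraph above.</p>

```shell
# python_supported MAJOR.MINOR[.PATCH]: succeeds for the tested range 3.10 to 3.12.
# (Hypothetical helper, named here for illustration only.)
python_supported() {
    major=${1%%.*}
    minor=${1#*.}
    minor=${minor%%.*}
    [ "$major" -eq 3 ] && [ "$minor" -ge 10 ] && [ "$minor" -le 12 ]
}

if command -v python3 >/dev/null 2>&1; then
    # Ask the interpreter itself for its major.minor version.
    version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if python_supported "$version"; then
        echo "Python $version is in the tested range"
    else
        echo "Python $version is outside the tested range (3.10 - 3.12)"
    fi
else
    echo "python3 not found"
fi
```

<p>A version outside the range is not necessarily broken; as noted above, newer versions may work.</p>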
|
||||||
|
<ol>
|
||||||
|
<li>
|
||||||
|
<p>Install dependencies. Paperless requires the following packages.</p>
|
||||||
|
<ul>
|
||||||
|
<li><code>python3</code></li>
|
||||||
|
<li><code>python3-pip</code></li>
|
||||||
|
<li><code>python3-dev</code></li>
|
||||||
|
<li><code>default-libmysqlclient-dev</code> for MariaDB</li>
|
||||||
|
<li><code>pkg-config</code> for mysqlclient (python dependency)</li>
|
||||||
|
<li><code>fonts-liberation</code> for generating thumbnails for plain text
|
||||||
|
files</li>
|
||||||
|
<li><code>imagemagick</code> >= 6 for PDF conversion</li>
|
||||||
|
<li><code>gnupg</code> for handling encrypted documents</li>
|
||||||
|
<li><code>libpq-dev</code> for PostgreSQL</li>
|
||||||
|
<li><code>libmagic-dev</code> for mime type detection</li>
|
||||||
|
<li><code>mariadb-client</code> for MariaDB compile time</li>
|
||||||
|
<li><code>libzbar0</code> for barcode detection</li>
|
||||||
|
<li><code>poppler-utils</code> for barcode detection</li>
|
||||||
|
</ul>
|
||||||
|
<p>Use this list with your preferred package manager:</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-9-1" name="__codelineno-9-1" href="#__codelineno-9-1"></a>python3 python3-pip python3-dev imagemagick fonts-liberation gnupg libpq-dev default-libmysqlclient-dev pkg-config libmagic-dev libzbar0 poppler-utils
|
||||||
|
</code></pre></div>
|
||||||
|
<p>These dependencies are required for OCRmyPDF, which is used for text
|
||||||
|
recognition.</p>
|
||||||
|
<ul>
|
||||||
|
<li><code>unpaper</code></li>
|
||||||
|
<li><code>ghostscript</code></li>
|
||||||
|
<li><code>icc-profiles-free</code></li>
|
||||||
|
<li><code>qpdf</code></li>
|
||||||
|
<li><code>liblept5</code></li>
|
||||||
|
<li><code>libxml2</code></li>
|
||||||
|
<li><code>pngquant</code> (suggested for certain PDF image optimizations)</li>
|
||||||
|
<li><code>zlib1g</code></li>
|
||||||
|
<li><code>tesseract-ocr</code> >= 4.0.0 for OCR</li>
|
||||||
|
<li><code>tesseract-ocr</code> language packs (<code>tesseract-ocr-eng</code>,
|
||||||
|
<code>tesseract-ocr-deu</code>, etc)</li>
|
||||||
|
</ul>
|
||||||
|
<p>Use this list with your preferred package manager:</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-10-1" name="__codelineno-10-1" href="#__codelineno-10-1"></a>unpaper ghostscript icc-profiles-free qpdf liblept5 libxml2 pngquant zlib1g tesseract-ocr
|
||||||
|
</code></pre></div>
|
||||||
|
<p>On Raspberry Pi, these libraries are required as well:</p>
|
||||||
|
<ul>
|
||||||
|
<li><code>libatlas-base-dev</code></li>
|
||||||
|
<li><code>libxslt1-dev</code></li>
|
||||||
|
<li><code>mime-support</code></li>
|
||||||
|
</ul>
|
||||||
|
<p>You will also need these for installing some of the python dependencies:</p>
|
||||||
|
<ul>
|
||||||
|
<li><code>build-essential</code></li>
|
||||||
|
<li><code>python3-setuptools</code></li>
|
||||||
|
<li><code>python3-wheel</code></li>
|
||||||
|
</ul>
|
||||||
|
<p>Use this list with your preferred package manager:</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-11-1" name="__codelineno-11-1" href="#__codelineno-11-1"></a>build-essential python3-setuptools python3-wheel
|
||||||
|
</code></pre></div>
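<p>The three package lists above can be combined into a single install command. The sketch below assumes a Debian-based system with <code>apt-get</code> (an assumption; use your own package manager otherwise). It only prints the command so it can be reviewed first. Language packs and the Raspberry Pi libraries are left out and would need to be added as required.</p>

```shell
# Package names are copied from the three lists above.
BASE_PKGS="python3 python3-pip python3-dev imagemagick fonts-liberation gnupg libpq-dev default-libmysqlclient-dev pkg-config libmagic-dev libzbar0 poppler-utils"
OCR_PKGS="unpaper ghostscript icc-profiles-free qpdf liblept5 libxml2 pngquant zlib1g tesseract-ocr"
BUILD_PKGS="build-essential python3-setuptools python3-wheel"

# apt-get is an assumption; the command is printed, not executed.
INSTALL_CMD="sudo apt-get install -y $BASE_PKGS $OCR_PKGS $BUILD_PKGS"
echo "$INSTALL_CMD"

# To actually install after reviewing the output:
#   eval "$INSTALL_CMD"
```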
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Install <code>redis</code> >= 6.0 and configure it to start automatically.</p>
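<p>To confirm that the installed redis meets the minimum version, something like the following may help. It is a sketch under two assumptions: <code>sort -V</code> is available (GNU coreutils), and <code>redis-server --version</code> prints a <code>v=X.Y.Z</code> field. The helper name <code>version_ge</code> is hypothetical.</p>

```shell
# version_ge A B: succeeds when version A >= version B.
# Relies on "sort -V" (version sort), which is a GNU coreutils feature.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v redis-server >/dev/null 2>&1; then
    # Extract the number after "v=" from the version banner.
    installed=$(redis-server --version | sed -n 's/.*v=\([0-9.]*\).*/\1/p')
    if version_ge "$installed" "6.0"; then
        echo "redis $installed is new enough"
    else
        echo "redis $installed is older than 6.0"
    fi
else
    echo "redis-server not found"
fi
```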
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Optional. Install <code>postgresql</code> and configure a database, user and
|
||||||
|
password for paperless. If you do not wish to use PostgreSQL,
|
||||||
|
MariaDB and SQLite are available as well.</p>
|
||||||
|
<div class="admonition note">
|
||||||
|
<p class="admonition-title">Note</p>
|
||||||
|
<p>On bare-metal installations using SQLite, ensure the <a href="https://code.djangoproject.com/wiki/JSON1Extension">JSON1
|
||||||
|
extension</a> is
|
||||||
|
enabled. This is usually the case, but not always.</p>
|
||||||
|
</div>
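<p>One way to test for the JSON1 extension is to run a JSON function through the SQLite library that Python is linked against. This is a sketch, assuming <code>python3</code> is the interpreter paperless will use; the variable name <code>JSON1_STATUS</code> is made up for this example.</p>

```shell
# Sets JSON1_STATUS to "ok", "missing", or "unknown" (python3 not available).
if command -v python3 >/dev/null 2>&1; then
    # The query fails with an OperationalError when json() is not compiled in.
    if python3 - 2>/dev/null <<'EOF'
import sqlite3
sqlite3.connect(":memory:").execute("select json('{\"a\": 1}')")
EOF
    then
        JSON1_STATUS=ok
    else
        JSON1_STATUS=missing
    fi
else
    JSON1_STATUS=unknown
fi
echo "SQLite JSON1 extension: $JSON1_STATUS"
```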
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Create a system user with a new home folder under which you wish
|
||||||
|
to run paperless.</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-12-1" name="__codelineno-12-1" href="#__codelineno-12-1"></a><span class="go">adduser paperless --system --home /opt/paperless --group</span>
|
||||||
|
</code></pre></div>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Get the release archive from
|
||||||
|
<a href="https://github.com/paperless-ngx/paperless-ngx/releases">https://github.com/paperless-ngx/paperless-ngx/releases</a>, for example with</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-13-1" name="__codelineno-13-1" href="#__codelineno-13-1"></a><span class="go">curl -O -L https://github.com/paperless-ngx/paperless-ngx/releases/download/v1.10.2/paperless-ngx-v1.10.2.tar.xz</span>
|
||||||
|
</code></pre></div>
|
||||||
|
<p>Extract the archive with</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-14-1" name="__codelineno-14-1" href="#__codelineno-14-1"></a><span class="go">tar -xf paperless-ngx-v1.10.2.tar.xz</span>
|
||||||
|
</code></pre></div>
|
||||||
|
<p>and copy the contents to the
|
||||||
|
home folder of the user you created before (<code>/opt/paperless</code>).</p>
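<p>The copy step can be sketched as a small helper. The function name <code>copy_release</code> is hypothetical, and the extracted directory name <code>paperless-ngx</code> in the usage comment is an assumption; check what <code>tar</code> actually produced on your system.</p>

```shell
# copy_release SRC DEST: copy everything inside SRC, including dotfiles,
# into DEST, creating DEST if needed. (Hypothetical helper.)
copy_release() {
    mkdir -p "$2" && cp -a "$1"/. "$2"/
}

# Intended use, run as root (directory name "paperless-ngx" is an assumption):
#   copy_release paperless-ngx /opt/paperless
#   chown -R paperless:paperless /opt/paperless
```

<p>Copying <code>SRC/.</code> rather than <code>SRC/*</code> makes sure hidden files are included as well.</p>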
|
||||||
|
<p>Optional: If you cloned the git repo, you will have to
|
||||||
|
compile the frontend yourself, see <a href="../development/#front-end-development">here</a>
|
||||||
|
and use the <code>build</code> step, not <code>serve</code>.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Configure paperless. See <a href="../configuration/">configuration</a> for details.
|
||||||
|
Edit the included <code>paperless.conf</code> and adjust the settings to your
|
||||||
|
needs. Required settings for getting
|
||||||
|
paperless running are:</p>
|
||||||
|
<ul>
|
||||||
|
<li><a href="../configuration/#PAPERLESS_REDIS"><code>PAPERLESS_REDIS</code></a> should point to your redis server, such as
<code>redis://localhost:6379</code>.</li>
|
||||||
|
<li><a href="../configuration/#PAPERLESS_DBENGINE"><code>PAPERLESS_DBENGINE</code></a> optional, and should be one of <code>postgres</code>,
|
||||||
|
<code>mariadb</code>, or <code>sqlite</code></li>
|
||||||
|
<li><a href="../configuration/#PAPERLESS_DBHOST"><code>PAPERLESS_DBHOST</code></a> should be the hostname on which your
PostgreSQL server is running. Leave this unset to use
SQLite instead. Also configure port, database name, user and
password as necessary.</li>
|
||||||
|
<li><a href="../configuration/#PAPERLESS_CONSUMPTION_DIR"><code>PAPERLESS_CONSUMPTION_DIR</code></a> should point to a folder which
|
||||||
|
paperless should watch for documents. You might want to have
|
||||||
|
this somewhere else. Likewise, <a href="../configuration/#PAPERLESS_DATA_DIR"><code>PAPERLESS_DATA_DIR</code></a> and
|
||||||
|
<a href="../configuration/#PAPERLESS_MEDIA_ROOT"><code>PAPERLESS_MEDIA_ROOT</code></a> define where paperless stores its data.
|
||||||
|
If you like, you can point both to the same directory.</li>
|
||||||
|
<li><a href="../configuration/#PAPERLESS_SECRET_KEY"><code>PAPERLESS_SECRET_KEY</code></a> should be a random sequence of
characters. It's used for authentication. Failing to set it
allows third parties to forge authentication credentials.</li>
|
||||||
|
<li><a href="../configuration/#PAPERLESS_URL"><code>PAPERLESS_URL</code></a> if you are behind a reverse proxy. This should
|
||||||
|
point to your domain. Please see
|
||||||
|
<a href="../configuration/">configuration</a> for more
|
||||||
|
information.</li>
|
||||||
|
</ul>
|
||||||
|
<p>Many more adjustments can be made to paperless, especially the OCR
|
||||||
|
part. The following options are recommended for everyone:</p>
|
||||||
|
<ul>
|
||||||
|
<li>Set <a href="../configuration/#PAPERLESS_OCR_LANGUAGE"><code>PAPERLESS_OCR_LANGUAGE</code></a> to the language most of your
|
||||||
|
documents are written in.</li>
|
||||||
|
<li>Set <a href="../configuration/#PAPERLESS_TIME_ZONE"><code>PAPERLESS_TIME_ZONE</code></a> to your local time zone.</li>
|
||||||
|
</ul>
|
||||||
|
<div class="admonition warning">
|
||||||
|
<p class="admonition-title">Warning</p>
|
||||||
|
<p>Ensure your Redis instance <a href="https://redis.io/docs/latest/operate/oss_and_stack/management/security/">is secured</a>.</p>
|
||||||
|
</div>
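<p>As an illustration of the settings above, the sketch below generates a random secret key and prints a set of lines that could be merged into <code>paperless.conf</code>. Every value is an example: the domain <code>paperless.example.com</code> is hypothetical, and the language and time zone are placeholders. <code>PAPERLESS_DBENGINE</code> is left out because it is optional.</p>

```shell
# 32 random bytes rendered as 64 hexadecimal characters.
SECRET_KEY=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')

# Example values only; adjust every line to your own setup.
cat <<EOF
PAPERLESS_REDIS=redis://localhost:6379
PAPERLESS_DBHOST=localhost
PAPERLESS_CONSUMPTION_DIR=/opt/paperless/consume
PAPERLESS_DATA_DIR=/opt/paperless/data
PAPERLESS_MEDIA_ROOT=/opt/paperless/media
PAPERLESS_SECRET_KEY=$SECRET_KEY
PAPERLESS_URL=https://paperless.example.com
PAPERLESS_OCR_LANGUAGE=eng
PAPERLESS_TIME_ZONE=Europe/Berlin
EOF
```

<p>Remove the <code>PAPERLESS_DBHOST</code> line if you intend to use SQLite.</p>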
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Create the following directories if they are missing:</p>
|
||||||
|
<ul>
|
||||||
|
<li><code>/opt/paperless/media</code></li>
|
||||||
|
<li><code>/opt/paperless/data</code></li>
|
||||||
|
<li><code>/opt/paperless/consume</code></li>
|
||||||
|
</ul>
|
||||||
|
<p>Adjust as necessary if you configured different folders.
|
||||||
|
Ensure that the paperless user has write permissions for every one
|
||||||
|
of these folders with</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-15-1" name="__codelineno-15-1" href="#__codelineno-15-1"></a><span class="go">ls -l -d /opt/paperless/media</span>
|
||||||
|
</code></pre></div>
|
||||||
|
<p>If needed, change the owner with</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-16-1" name="__codelineno-16-1" href="#__codelineno-16-1"></a><span class="go">sudo chown paperless:paperless /opt/paperless/media</span>
|
||||||
|
<a id="__codelineno-16-2" name="__codelineno-16-2" href="#__codelineno-16-2"></a><span class="go">sudo chown paperless:paperless /opt/paperless/data</span>
|
||||||
|
<a id="__codelineno-16-3" name="__codelineno-16-3" href="#__codelineno-16-3"></a><span class="go">sudo chown paperless:paperless /opt/paperless/consume</span>
|
||||||
|
</code></pre></div>
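<p>Creating the directories can be scripted as well. The helper name <code>make_paperless_dirs</code> is hypothetical, and the sketch assumes the default folder names listed above.</p>

```shell
# make_paperless_dirs BASE: create media, data and consume below BASE.
# (Hypothetical helper; existing directories are left untouched.)
make_paperless_dirs() {
    for name in media data consume; do
        mkdir -p "$1/$name" || return 1
    done
}

# Intended use, run as root:
#   make_paperless_dirs /opt/paperless
#   chown paperless:paperless /opt/paperless/media /opt/paperless/data /opt/paperless/consume
# Check write access as the paperless user:
#   sudo -u paperless test -w /opt/paperless/media && echo "media is writable"
```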
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Install python requirements from the <code>requirements.txt</code> file.</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-17-1" name="__codelineno-17-1" href="#__codelineno-17-1"></a><span class="go">sudo -Hu paperless pip3 install -r requirements.txt</span>
|
||||||
|
</code></pre></div>
|
||||||
|
<p>This will install all python dependencies in the home directory of
|
||||||
|
the new paperless user.</p>
|
||||||
|
<div class="admonition tip">
|
||||||
|
<p class="admonition-title">Tip</p>
|
||||||
|
<p>It is up to you if you wish to use a virtual environment or not for the Python
|
||||||
|
dependencies. This is an alternative to the above and may require adjusting
|
||||||
|
the example scripts to use the virtual environment paths.</p>
|
||||||
|
</div>
|
||||||
|
<div class="admonition tip">
|
||||||
|
<p class="admonition-title">Tip</p>
|
||||||
|
<p>If you use modern Python tooling, such as <code>uv</code>, installation will not include
dependencies for Postgres or MariaDB. You can select those extras with <code>--extra &lt;EXTRA&gt;</code>
or all with <code>--all-extras</code>.</p>
|
||||||
|
</div>
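<p>For the virtual environment route mentioned in the first tip, the commands might look like the following. This is a sketch only: the location <code>/opt/paperless/venv</code> is a hypothetical choice, and on Debian the <code>venv</code> module typically requires the <code>python3-venv</code> package. The script prints the commands rather than running them.</p>

```shell
# Hypothetical location for the virtual environment.
VENV_DIR=/opt/paperless/venv

# Print the commands for review; run them from /opt/paperless.
VENV_CMDS=$(cat <<EOF
sudo -Hu paperless python3 -m venv $VENV_DIR
sudo -Hu paperless $VENV_DIR/bin/pip install -r requirements.txt
EOF
)
echo "$VENV_CMDS"
```

<p>With this layout, later commands would use <code>/opt/paperless/venv/bin/python3</code> instead of the system <code>python3</code>, and the service files would need the same adjustment.</p>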
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Go to <code>/opt/paperless/src</code>, and execute the following command:</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-18-1" name="__codelineno-18-1" href="#__codelineno-18-1"></a><span class="c1"># This creates the database schema.</span>
|
||||||
|
<a id="__codelineno-18-2" name="__codelineno-18-2" href="#__codelineno-18-2"></a>sudo<span class="w"> </span>-Hu<span class="w"> </span>paperless<span class="w"> </span>python3<span class="w"> </span>manage.py<span class="w"> </span>migrate
|
||||||
|
</code></pre></div>
|
||||||
|
<p>When you first access the web interface you will be prompted to create a superuser account.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Optional: Test that paperless is working by executing</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-19-1" name="__codelineno-19-1" href="#__codelineno-19-1"></a><span class="c1"># Manually starts the webserver</span>
|
||||||
|
<a id="__codelineno-19-2" name="__codelineno-19-2" href="#__codelineno-19-2"></a>sudo<span class="w"> </span>-Hu<span class="w"> </span>paperless<span class="w"> </span>python3<span class="w"> </span>manage.py<span class="w"> </span>runserver
|
||||||
|
</code></pre></div>
|
||||||
|
<p>and pointing your browser to http://localhost:8000 if
|
||||||
|
accessing from the same device on which paperless is installed.
|
||||||
|
If accessing from another machine, set up systemd services. You may need
|
||||||
|
to set <code>PAPERLESS_DEBUG=true</code> in order for the development server to work
|
||||||
|
normally in your browser.</p>
|
||||||
|
<div class="admonition warning">
|
||||||
|
<p class="admonition-title">Warning</p>
|
||||||
|
<p>This is a development server which should not be used in production.
|
||||||
|
It is not audited for security and performance is inferior to
|
||||||
|
production ready web servers.</p>
|
||||||
|
</div>
|
||||||
|
<div class="admonition tip">
|
||||||
|
<p class="admonition-title">Tip</p>
|
||||||
|
<p>This will not start the consumer. Paperless does this in a separate
|
||||||
|
process.</p>
|
||||||
|
</div>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Set up systemd services to run paperless automatically. You may use
|
||||||
|
the service definition files included in the <code>scripts</code> folder as a
|
||||||
|
starting point.</p>
|
||||||
|
<p>Paperless needs the <code>webserver</code> script to run the webserver, the
<code>consumer</code> script to watch the input folder, <code>taskqueue</code> for the
background workers used to handle things like document consumption,
and the <code>scheduler</code> script to run tasks such as email checking at
certain times.</p>
|
||||||
|
<div class="admonition note">
|
||||||
|
<p class="admonition-title">Note</p>
|
||||||
|
<p>The <code>socket</code> script enables <code>granian</code> to run on port 80 without
|
||||||
|
root privileges. For this you need to uncomment the
<code>Requires=paperless-webserver.socket</code> line in the <code>webserver</code> script
|
||||||
|
and configure <code>granian</code> to listen on port 80 (set <code>GRANIAN_PORT</code>).</p>
|
||||||
|
</div>
|
||||||
|
<p>These services rely on redis and optionally the database server, but
|
||||||
|
don't need to be started in any particular order. The example files
|
||||||
|
depend on redis being started. If you use a database server, you
|
||||||
|
should add additional dependencies.</p>
|
||||||
|
<div class="admonition note">
|
||||||
|
<p class="admonition-title">Note</p>
|
||||||
|
<p>For instructions on using a reverse proxy,
|
||||||
|
<a href="https://github.com/paperless-ngx/paperless-ngx/wiki/Using-a-Reverse-Proxy-with-Paperless-ngx#">see the wiki</a>.</p>
|
||||||
|
</div>
|
||||||
|
<div class="admonition warning">
|
||||||
|
<p class="admonition-title">Warning</p>
|
||||||
|
<p>If celery won't start (check with
<code>sudo systemctl status paperless-task-queue.service</code> and
<code>sudo systemctl status paperless-scheduler.service</code>),
you need to change the path to celery in those service files. Example:
<code>ExecStart=/opt/paperless/.local/bin/celery --app paperless worker --loglevel INFO</code></p>
|
||||||
|
</div>
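<p>The installation of the service files can be sketched as follows. The unit file names are assumptions based on the script names mentioned above; verify them against the contents of the <code>scripts</code> folder. The script locates celery and prints the commands instead of running them.</p>

```shell
# Unit names are assumptions; compare with the files in /opt/paperless/scripts.
UNITS="paperless-webserver.service paperless-consumer.service paperless-task-queue.service paperless-scheduler.service"

# Locate celery: PATH first, then the per-user install location from the warning above.
CELERY_BIN=$(command -v celery 2>/dev/null || true)
if [ -z "$CELERY_BIN" ] && [ -x /opt/paperless/.local/bin/celery ]; then
    CELERY_BIN=/opt/paperless/.local/bin/celery
fi
echo "celery binary: ${CELERY_BIN:-not found}"

# Print the commands that would install and enable the units.
echo "sudo cp /opt/paperless/scripts/*.service /etc/systemd/system/"
echo "sudo systemctl daemon-reload"
for unit in $UNITS; do
    echo "sudo systemctl enable --now $unit"
done
```

<p>If the printed celery path differs from the one in the service files, that path is the value to put into <code>ExecStart</code>.</p>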
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Optional: Install a samba server and make the consumption folder
|
||||||
|
available as a network share.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Configure ImageMagick to allow processing of PDF documents. Most
|
||||||
|
distributions have this disabled by default, since PDF documents can
|
||||||
|
contain malware. If you don't do this, paperless will fall back to
|
||||||
|
Ghostscript for certain steps such as thumbnail generation.</p>
|
||||||
|
<p>Edit <code>/etc/ImageMagick-6/policy.xml</code> and adjust</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-20-1" name="__codelineno-20-1" href="#__codelineno-20-1"></a>&lt;policy domain="coder" rights="none" pattern="PDF" /&gt;
|
||||||
|
</code></pre></div>
|
||||||
|
<p>to</p>
|
||||||
|
<div class="highlight"><pre><span></span><code><a id="__codelineno-21-1" name="__codelineno-21-1" href="#__codelineno-21-1"></a>&lt;policy domain="coder" rights="read|write" pattern="PDF" /&gt;
|
||||||
|
</code></pre></div>
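<p>The same change can be made with <code>sed</code> instead of a text editor. The sketch below uses a hypothetical helper, <code>allow_pdf</code>, that filters the policy file from standard input to standard output, so the result can be inspected before anything under <code>/etc</code> is overwritten. It assumes the policy line is written exactly as shown above.</p>

```shell
# allow_pdf: read policy XML on stdin and write it to stdout with the PDF
# coder switched from rights="none" to rights="read|write".
# (Hypothetical helper; only lines containing pattern="PDF" are changed.)
allow_pdf() {
    sed '/pattern="PDF"/s/rights="none"/rights="read|write"/'
}

# Example run on the single line quoted above:
echo '<policy domain="coder" rights="none" pattern="PDF" />' | allow_pdf

# Intended use, after reviewing the output:
#   allow_pdf < /etc/ImageMagick-6/policy.xml > /tmp/policy.xml
#   sudo cp /tmp/policy.xml /etc/ImageMagick-6/policy.xml
```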
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Optional: Install the
|
||||||
|
<a href="https://ocrmypdf.readthedocs.io/en/latest/jbig2.html">jbig2enc</a>
|
||||||
|
encoder. This will reduce the size of generated PDF documents.
|
||||||
|
You'll most likely need to compile this by yourself, because this
software was patented until around 2017 and binary packages are
|
||||||
|
not available for most distributions.</p>
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
<p>Optional: If using the NLTK machine learning processing (see
|
||||||
|
<a href="../configuration/#PAPERLESS_ENABLE_NLTK"><code>PAPERLESS_ENABLE_NLTK</code></a> for details),
|
||||||
|
download the NLTK data for the Snowball
|
||||||
|
Stemmer, Stopwords and Punkt tokenizer to <code>/usr/share/nltk_data</code>. Refer to the <a href="https://www.nltk.org/data.html">NLTK
|
||||||
|
instructions</a> for details on how to
|
||||||
|
download the data.</p>
|
||||||
|
</li>
|
||||||
|
</ol>
|
||||||