The project contains a Dockerfile, so a Docker image can be built from the project and run as a container.

DI network: ddb-di-vm08, checked out in /data/ddb/tools/automaticClassification/git (git clone https://dev.fiz-karlsruhe.de/stash/scm/apd/ise-apd-classification.git)

Docker is installed according to https://docs.docker.com/engine/install/

Install Python?

Autostart: 

Code Block
sudo systemctl enable docker.service
sudo systemctl enable containerd.service
sudo usermod -a -G docker admin # lets the admin user talk to the Docker daemon (takes effect after the next login)

Default directory

Normally this is /var/lib/docker, but there is not enough disk space there. Therefore, in /usr/lib/systemd/system/docker.service, change the line
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
by inserting -g ...:
ExecStart=/usr/bin/dockerd -g /data/ddb/tools/docker -H fd:// --containerd=/run/containerd/containerd.sock
then run systemctl daemon-reload and restart the docker service.
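
On current Docker releases the `-g` flag is deprecated in favor of `--data-root`, and a package update may overwrite edits to /usr/lib/systemd/system/docker.service. One possible alternative, sketched here under the assumption that no /etc/docker/daemon.json exists yet, is to set `data-root` in that file. The helper name `docker_daemon_json` is hypothetical; the target path is the one used above. Use either this file or the ExecStart flag, not both: Docker's documentation warns that the daemon fails to start when the same option is set as a flag and in daemon.json.

```shell
# Print a daemon.json that moves Docker's data directory.
# docker_daemon_json is a hypothetical helper; the default path is the one from this page.
docker_daemon_json() {
  local data_root="${1:-/data/ddb/tools/docker}"
  printf '{\n  "data-root": "%s"\n}\n' "$data_root"
}

# Usage as root, assuming /etc/docker/daemon.json does not exist yet:
#   docker_daemon_json > /etc/docker/daemon.json
#   systemctl restart docker
#   docker info --format '{{ .DockerRootDir }}'   # should report the new directory
```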

Proxy:

As root, create the new directory and file /etc/systemd/system/docker.service.d/http-proxy.conf
Contents:


Code Block
[Service]
Environment="HTTP_PROXY=http://proxy.fiz-karlsruhe.de:8888"
Environment="HTTPS_PROXY=http://proxy.fiz-karlsruhe.de:8888"
Environment="NO_PROXY=localhost,127.0.0.1"
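
systemd only picks up a new drop-in file after a reload, and the Docker daemon reads the proxy variables when it starts. The follow-up steps could look like the sketch below, assuming a systemd host with sudo. The function name `apply_docker_proxy` and the `SUDO` override are illustrative and not part of the original setup.

```shell
# Reload systemd, restart Docker, then print the environment the service was started with.
# SUDO can be overridden: SUDO=echo prints the commands, SUDO= runs them directly as root.
apply_docker_proxy() {
  local sudo_cmd="${SUDO-sudo}"
  $sudo_cmd systemctl daemon-reload &&
  $sudo_cmd systemctl restart docker &&
  $sudo_cmd systemctl show --property=Environment docker
}

# Usage: apply_docker_proxy
# The last command should list the HTTP_PROXY, HTTPS_PROXY and NO_PROXY values from the file.
```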


Additionally, create the new directory and file <home>/.docker/config.json
Contents:

Code Block
{
 "proxies":
 {
   "default":
   {
     "httpProxy": "http://proxy.fiz-karlsruhe.de:8888",
     "httpsProxy": "http://proxy.fiz-karlsruhe.de:8888",
     "noProxy": "*.test.example.com,.example2.com,127.0.0.0/8"
   }
 }
}
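
With this client configuration, Docker passes the proxy settings as environment variables into the containers it starts and into builds. A quick check is sketched below, assuming the apdweimar:latest image has already been built and defines no custom entrypoint. The function name `show_container_proxy` and the `DOCKER` override are hypothetical.

```shell
# Print the environment seen inside a fresh container of the given image.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the command instead of running it.
show_container_proxy() {
  local image="${1:-apdweimar:latest}"
  ${DOCKER:-docker} run --rm "$image" env
}

# Usage: show_container_proxy | grep -i proxy
```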


Build + Run:

See also https://dev.fiz-karlsruhe.de/stash/projects/APD/repos/ise-apd-classification/browse

Code Block
# Build
cd /data/ddb/tools/ise-apd-classification
docker build -t apdweimar:latest .

Run:

Adapt the configuration (endpoints etc.) in /data/ddb/tools/ise-apd-classification/config.py

Load Data:

  • Either from the API (the ddbIds are listed in /data/ddb/tools/ise-apd-classification/data/onjectIds.csv) or from files (located in /data/ddb/tools/ise-apd-classification/data/dump)
  • The objectIds or files have to be supplied by the archives.

  • Code Block
    # Load Data
    cd /data/ddb/tools/ise-apd-classification
    
    # From the API
    docker run --rm -v "$(pwd)"/data:/usr/src/app/data --name APDclassifier apdweimar:latest python loadData.py
    
    # From files (the path to the dump files is defined in /data/ddb/tools/ise-apd-classification/config.py).
    # Does not work with the files from git: wrong keywordIds are generated (e.g. c-102 instead of 0s).
    docker run --rm -v "$(pwd)"/data:/usr/src/app/data --name APDclassifier apdweimar:latest python loadData.py --local
    
    # Output is written to /data/ddb/tools/ise-apd-classification/data/input


Create Suggestions:


  • Code Block
    # Create Suggestions
    cd /data/ddb/tools/ise-apd-classification
    docker run --rm -v "$(pwd)"/data:/usr/src/app/data --name APDclassifier apdweimar:latest python classification.py -m TfIdfClassifier -s data/models/tfidf.pickle data/output/predictions.csv
    # Optionally use a different model (-m), or load an existing trained model instead of training and saving a new one (-l instead of -s)


  • WARNING: /usr/local/lib/python3.9/site-packages/sklearn/utils/validation.py:593: FutureWarning: np.matrix usage is deprecated in 1.0 and will raise a TypeError in 1.2. Please convert to a numpy array with np.asarray. For more information see: https://numpy.org/doc/stable/reference/generated/numpy.matrix.html
  • Open questions: Which model should be used? When should a newly trained model be saved?
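
The FutureWarning quoted above is a deprecation notice from scikit-learn rather than an error. If it clutters the log, Python's standard `PYTHONWARNINGS` variable can filter it inside the container. The sketch below assumes the warning is harmless for the installed library versions. The function name `classify_quiet` and the `DOCKER` override are illustrative.

```shell
# Same classification call as above, with FutureWarning messages filtered out.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the command instead of running it.
classify_quiet() {
  ${DOCKER:-docker} run --rm \
    -e PYTHONWARNINGS=ignore::FutureWarning \
    -v "$(pwd)"/data:/usr/src/app/data \
    --name APDclassifier apdweimar:latest \
    python classification.py -m TfIdfClassifier \
    -s data/models/tfidf.pickle data/output/predictions.csv
}

# Usage (from /data/ddb/tools/ise-apd-classification): classify_quiet
```

Filtering only hides the message. The underlying `np.matrix` usage would still have to be fixed before a library upgrade turns it into a TypeError.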

Load Suggestions:


  • Code Block
    # Load Suggestions
    # Configure the Assignment-Tool-Backend URL in /data/ddb/tools/ise-apd-classification/config.py
    cd /data/ddb/tools/ise-apd-classification
    docker run --rm -v "$(pwd)"/data:/usr/src/app/data --name APDclassifier apdweimar:latest python uploadPredictions.py data/output/predictions.csv
    
    # To keep the old suggestions instead of deleting them:
    docker run --rm -v "$(pwd)"/data:/usr/src/app/data --name APDclassifier apdweimar:latest python uploadPredictions.py --keep data/output/predictions.csv

    ERROR during purge: 2022-04-07 09:46:30.500 ERROR [d.f.d.a.s.AssignmentToolBackendExceptionMapper] - URL: http://dev-apd.fiz-karlsruhe.de/assignment-tool-backend/keyword-relation, Method: DELETE, Message: NotAllowedException: HTTP 405 Method Not Allowed
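
The load, classify and upload steps can be chained so that a failing step stops the run. A minimal sketch using only the commands shown on this page follows. The function names `apd_run` and `apd_pipeline` and the `DOCKER` override are hypothetical. It uploads with `--keep`, which according to the comment above leaves old suggestions in place and so presumably avoids the purge request that failed with HTTP 405. Whether keeping old suggestions is acceptable depends on the use case.

```shell
# Run one project script inside the container, mounting ./data as on this page.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the commands instead of running them.
apd_run() {
  ${DOCKER:-docker} run --rm -v "$(pwd)"/data:/usr/src/app/data \
    --name APDclassifier apdweimar:latest python "$@"
}

# Load data from the API, create suggestions, upload them; stop at the first failure.
apd_pipeline() {
  local project_dir="${1:-/data/ddb/tools/ise-apd-classification}"
  (
    cd "$project_dir" || exit 1
    apd_run loadData.py &&
    apd_run classification.py -m TfIdfClassifier \
      -s data/models/tfidf.pickle data/output/predictions.csv &&
    apd_run uploadPredictions.py --keep data/output/predictions.csv
  )
}

# Usage: apd_pipeline
```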