termic-data

Warning
This repo has been archived as termic data is now stored on Dropbox. Scripts are now available in the main termic repo.


This repository contained Microsoft's translation memory and glossary files used by termic. Check the termic repo for more information on data collection.

Structure

  • /csv_to_merge: separate .csv translation memory files for each language,
  • /data: merged .csv translation memory files (see merge_csv.py) and .xlsx glossary files for each language,
  • /feather: example pandas feather files (for local data),
  • /scripts:
    • convert_to_feather.py: use this script to convert .csv and .xlsx files to the feather format,
    • merge_csv.py: use this script to merge the .csv files in the csv_to_merge folder into one .csv file.
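
For readers without access to the archived scripts, the merge step described above can be sketched roughly as follows. This is not the repository's merge_csv.py: the function name merge_csv_files, the UTF-8 encoding, and the assumption that every file in the folder shares the same header row are all illustrative choices, and only the Python standard library is used.

```python
import csv
from pathlib import Path


def merge_csv_files(source_dir: Path, output_file: Path) -> int:
    """Concatenate every .csv file in source_dir into output_file.

    Hypothetical helper: assumes all input files share one header row.
    Returns the number of data rows written (header excluded).
    """
    rows_written = 0
    header = None
    with output_file.open("w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        # Sort the file names so the merged output is deterministic.
        for csv_path in sorted(source_dir.glob("*.csv")):
            # Never read the file we are currently writing to.
            if csv_path.resolve() == output_file.resolve():
                continue
            with csv_path.open(newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    # Write the header once, taken from the first file.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(
                        f"{csv_path.name}: header differs from the first file"
                    )
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Calling merge_csv_files(Path("csv_to_merge"), Path("data/merged.csv")) would mirror the folder layout listed above; the output file name merged.csv is an assumption, and the data folder must already exist.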

Usage

NOTE: requirements.txt was written specifically for merge_csv.py.

Set up a virtualenv:

mkdir -p ~/.cache/virtualenvs
python3 -m venv ~/.cache/virtualenvs/termic-data
source ~/.cache/virtualenvs/termic-data/bin/activate
python3 -m pip install -r ./scripts/requirements.txt

Contributors