termic: an alternative to Microsoft Terminology Search
A deployed and ready-to-use version of termic is available on https://termic.me.
Data Collection
Data used by termic is available for download on Dropbox.
Translation Memory
The 2020+ translation memory was retrieved from Visual Studio Dev Essentials as .csv files. Those files were merged using a custom script, merge_csv.py, which is available in the termic-data GitHub project.
To download Microsoft's translation memory, follow these steps:
- Go to Visual Studio Dev Essentials.
- Search for "Translation and UI Strings Glossaries September 2020" in the "Search downloads" search bar.
- The link should appear as you start typing; click on it and search.
- Choose your language on the right and click on the "Download" button.
In addition, this dataset was expanded with VSCode strings (which are not available in the TM provided by Microsoft). vscode_data.py was used for extraction.
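The merge step mentioned above can be sketched with the standard library alone. This is not the actual merge_csv.py from termic-data; it is a minimal illustration that assumes every downloaded .csv file shares the same header row and is UTF-8 encoded. The function name `merge_csv` and its parameters are hypothetical.

```python
import csv
from pathlib import Path


def merge_csv(src_dir, dst_file, encoding="utf-8"):
    """Concatenate every .csv file in src_dir into dst_file.

    Assumption: all files share the same header row, which is written once.
    Returns the number of data rows written.
    """
    rows_written = 0
    header = None
    dst_path = Path(dst_file).resolve()
    with open(dst_file, "w", newline="", encoding=encoding) as out:
        writer = csv.writer(out)
        for path in sorted(Path(src_dir).glob("*.csv")):
            if path.resolve() == dst_path:
                continue  # do not merge the output file into itself
            with open(path, newline="", encoding=encoding) as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path.name}: header differs from the first file")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Calling `merge_csv("downloads", "merged.csv")` would then produce a single file with one header row followed by the rows of each input file in alphabetical order of file name.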
Glossaries
Glossaries were retrieved from the Microsoft Terminology Collection. Those are .tbx files that were converted to .xlsx using Xbench.
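If Xbench is not available, the term pairs can also be read from a .tbx file directly, since TBX is an XML format. The sketch below is an alternative to the Xbench conversion, not the method used for termic. It assumes the classic TBX layout (`termEntry` > `langSet` with an `xml:lang` attribute > `term`) without an XML namespace; the function name `tbx_term_pairs` and the default source language `en-US` are assumptions.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tbx_term_pairs(source, src_lang="en-US"):
    """Yield (source_term, target_lang, target_term) tuples from a TBX file.

    Assumption: un-namespaced termEntry/langSet/term elements.
    `source` may be a file path or a binary file object.
    """
    tree = ET.parse(source)
    for entry in tree.iter("termEntry"):
        terms_by_lang = {}
        for lang_set in entry.iter("langSet"):
            lang = lang_set.get(XML_LANG)
            found = [t.text.strip() for t in lang_set.iter("term") if t.text]
            terms_by_lang.setdefault(lang, []).extend(found)
        # Pair every source-language term with every term in the other languages.
        for src_term in terms_by_lang.get(src_lang, []):
            for lang, targets in terms_by_lang.items():
                if lang == src_lang:
                    continue
                for target in targets:
                    yield src_term, lang, target
```

The resulting tuples can be written to .csv or .xlsx with the tool of your choice.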
Requirements
Website
- Python (>=3.10)
- Flask (>=2.3.1)
- psycopg2 (>=2.9.6)
Scripts
- Python (>=3.10)
- requests (>=2.29.0)
- pandas (>=2.0.1)
Usage
git clone https://github.com/spidersouris/termic.git
cd termic
pip install -r requirements.txt
python termic.py
- Go to http://localhost:5000
Using a database
If you want to run termic locally with the data available for download on Dropbox or with your own terminology data, using a local database is recommended.
You can change the connection string in config/db_config.py to connect to your database.
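As an illustration of what such a connection string can look like, here is a minimal sketch that builds a libpq keyword/value string from environment variables and passes it to `psycopg2.connect`. It does not reproduce the contents of config/db_config.py; the `TERMIC_DB_*` variable names, the default values, and the helper names are hypothetical.

```python
import os


def _quote(value):
    """Quote a value for a libpq keyword/value connection string."""
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_dsn(env=None):
    """Build a connection string from (hypothetical) TERMIC_DB_* variables."""
    env = os.environ if env is None else env
    parts = {
        "host": env.get("TERMIC_DB_HOST", "localhost"),
        "port": env.get("TERMIC_DB_PORT", "5432"),
        "dbname": env.get("TERMIC_DB_NAME", "termic"),
        "user": env.get("TERMIC_DB_USER", "postgres"),
    }
    password = env.get("TERMIC_DB_PASSWORD")
    if password:
        parts["password"] = password
    return " ".join(f"{key}={_quote(val)}" for key, val in parts.items())


def connect(env=None):
    """Open a connection; psycopg2 is imported lazily so build_dsn works without it."""
    import psycopg2

    return psycopg2.connect(build_dsn(env))
```

Reading credentials from the environment keeps passwords out of version control, which matters if the configuration file is committed.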
Using local data files
Using local data files with termic is far from ideal. However, if it is your only option, here are a few tips:
- You can use pandas (or any other Python data processing library) to process the .csv (translation memory) and .xlsx (glossary) files.
- If you do use pandas, consider converting the .csv and .xlsx files to the binary feather format to reduce disk usage and improve search time. To do so, you can use the convert_to_feather.py script, which is available in the termic-data GitHub project.
- You can use termic_pandas.py as a basis for your local deployment.
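The tips above can be sketched as follows. This is neither convert_to_feather.py nor termic_pandas.py, only a rough illustration of the idea. It assumes pyarrow is installed (pandas needs it for feather files), openpyxl is installed for .xlsx input, and the data has a column named `source`; that column name and both function names are hypothetical.

```python
import pandas as pd


def convert_to_feather(src, dst):
    """Load a .csv or .xlsx file and save it in feather format.

    Requires pyarrow; reading .xlsx additionally requires openpyxl.
    """
    if str(src).lower().endswith(".xlsx"):
        df = pd.read_excel(src)
    else:
        df = pd.read_csv(src)
    df.to_feather(dst)
    return df


def search(df, term, column="source", limit=50):
    """Return up to `limit` rows whose `column` contains `term` (case-insensitive)."""
    # regex=False treats the term literally; na=False skips missing cells.
    mask = df[column].str.contains(term, case=False, regex=False, na=False)
    return df[mask].head(limit)
```

A converted file can later be loaded with `pd.read_feather(dst)` and queried with `search(df, "file")`.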
Mail server
termic comes with Flask-Mail, which means that you can configure an SMTP endpoint to receive messages sent via the contact form (on the /about route by default).
This requires setting up the following environment variables (see config/mail_config.py for more information): MAIL_SERVER, MAIL_PORT, MAIL_USERNAME, MAIL_PASSWORD, MAIL_SENDER, MAIL_RECIPIENTS.
If any of these environment variables are undefined, the mail service will be disabled and users won't be able to send emails.
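The all-or-nothing behaviour described above can be pictured with a short check. This is a sketch, not the code in config/mail_config.py; the helper names are hypothetical, and treating an empty value as undefined is an assumption.

```python
import os

# Variable names are taken from the list above.
REQUIRED_MAIL_VARS = (
    "MAIL_SERVER",
    "MAIL_PORT",
    "MAIL_USERNAME",
    "MAIL_PASSWORD",
    "MAIL_SENDER",
    "MAIL_RECIPIENTS",
)


def missing_mail_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_MAIL_VARS if not env.get(name)]


def mail_enabled(env=None):
    """The mail service is usable only when every required variable is set."""
    return not missing_mail_vars(env)
```

Running `missing_mail_vars()` before starting the server is a quick way to see which variables still need to be exported.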
Development
flask --app termic run --debug
Deployment
You can use Heroku for deployment.
heroku login
heroku create --app termic
git push heroku master
Contributors
- benediktkr
- MK