Introduction

Simple Sentence Mining (ssmtool) is a universal, cross-platform, multilingual, clipboard-based tool that helps you generate Anki flashcards from any source. It is free software under the GNU GPLv3 license, meaning you are free to use, modify, and distribute it, even commercially, as long as it remains under the same license.

It is designed to be simple to set up and straightforward to use, with the ultimate goal of enabling smooth sentence mining during immersion study. Originally, I wanted to make an Anki add-on, but I noticed several issues with that idea:

  • Anki has no stable API, and add-ons regularly break on new releases, which makes them difficult to maintain.
  • I would like the tool to be extensible for other, potentially better flashcard tools in the future.

Feature set

Easy word lookups

You can simply copy anything to the clipboard. If it is a single word, it will be looked up right away, which allows you to use it as a dictionary app too.

Otherwise, once the copied content appears in the “Sentence” field, you can double-click a word to look it up. For languages that do not use spaces, you have to select the word manually and then either press the button or use the keyboard shortcut. Better support for Chinese and Japanese is expected in the near future.
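The behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not ssmtool's actual code; the function name route_clipboard and the returned labels are made up for this example.

```python
from typing import Tuple


def route_clipboard(text: str) -> Tuple[str, str]:
    """Decide what to do with freshly copied text (hypothetical helper).

    Returns ("lookup", word) when the text is a single word, otherwise
    ("sentence", text) so it can be shown in the Sentence field.
    """
    cleaned = text.strip()
    # str.split() without arguments splits on any run of whitespace.
    tokens = cleaned.split()
    if len(tokens) == 1:
        return ("lookup", tokens[0])
    return ("sentence", cleaned)
```

Note that a whitespace-based check like this treats an unspaced Chinese or Japanese sentence as a single word, which is one reason those languages need manual selection.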

Configuration-free dictionaries

From version 0.2.0, ssmtool comes with three dictionaries out of the box:

  • English Wiktionary (wikt-en): definitions are in English.
  • Google Translate (gtrans): translates to English; support for translating to other languages is expected in the near future. No API keys required.
  • Google Dictionary (gdict): monolingual, i.e., definitions are in the same language.

There will be support for local dictionaries soon.

Lemmatization

Many similar tools, such as LingQ, Readlang, and Learn with Texts, do not have the ability to lemmatize words, that is, to convert words into their dictionary forms. The issue with this is that many dictionaries do not list all forms of a word, especially in highly inflected languages (Russian, Finnish). With lemmatization, dictionary lookups become much smoother because there is no longer a need to manually modify the word before looking it up.
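To illustrate why this matters, here is a minimal sketch of a lookup that lemmatizes the word first. The lemma table and the dictionary are toy data invented for this example, and the function name lookup is hypothetical; a real lemmatizer covers far more words.

```python
from typing import Dict, Optional

# Toy data, made up for this example only.
TOY_LEMMAS: Dict[str, str] = {"went": "go", "houses": "house"}
TOY_DICTIONARY: Dict[str, str] = {
    "go": "to move from one place to another",
    "house": "a building for people to live in",
}


def lookup(word: str, lemmatize: bool = True) -> Optional[str]:
    """Look up a word, optionally converting it to its dictionary form first."""
    key = word.lower()
    if lemmatize:
        # Fall back to the word itself when no lemma is known.
        key = TOY_LEMMAS.get(key, key)
    return TOY_DICTIONARY.get(key)
```

With lemmatization off, lookup("went", lemmatize=False) finds nothing because the toy dictionary only lists "go"; with it on, the same call returns the entry for "go".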

Anki integration

This program uses AnkiConnect to add cards directly to Anki, without going through CSV exports. Tags and custom note layouts are also supported.
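For the curious, adding a note through AnkiConnect comes down to sending a small JSON request to its local HTTP endpoint. The sketch below is not taken from ssmtool; the helper names build_add_note_request and send are made up, and the deck, note type, and field names you pass in depend on your own Anki setup. It assumes AnkiConnect is listening on its default address.

```python
import json
import urllib.request
from typing import Any, Dict, List

# AnkiConnect's default address; change it if you configured another one.
ANKICONNECT_URL = "http://127.0.0.1:8765"


def build_add_note_request(
    deck: str, note_type: str, fields: Dict[str, str], tags: List[str]
) -> Dict[str, Any]:
    """Build the JSON body for AnkiConnect's addNote action."""
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck,
                "modelName": note_type,
                "fields": fields,
                "tags": tags,
            }
        },
    }


def send(request: Dict[str, Any], url: str = ANKICONNECT_URL) -> Any:
    """POST a request to AnkiConnect and return its result (Anki must be running)."""
    data = json.dumps(request).encode("utf-8")
    http_request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(http_request) as response:
        reply = json.load(response)
    # AnkiConnect replies with an object holding "result" and "error".
    if reply.get("error") is not None:
        raise RuntimeError(reply["error"])
    return reply["result"]
```

With Anki open, send(build_add_note_request("Default", "ssmtool-reading", {"Sentence": "...", "Word": "...", "Definition": "..."}, ["ssmtool"])) would add one note, assuming a deck named "Default" and a note type with those three fields exist.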

Browser integration

The Click Copy Sentence extension is available for Firefox and Chrome (incl. derivatives).

This extension enables you to mine sentences with a single click on most websites, further smoothing the immersion experience.

Language Support Matrix

Language lemma wikt-en gdict gtrans “Support level”1
English Full
Russian Full
Chinese Partial
Italian Full
Finnish Partial
Japanese Partial
Spanish Full
French Full
German Full
Latin Partial
Polish Partial
Portuguese 2 Full
Serbo-Croatian Partial
Dutch Partial
Romanian Partial
Hindi Partial
Korean Partial
Arabic Partial
Turkish Full

Installation

There are three components you need to install to make this program work.

Main Desktop Application

GNU/Linux

If you are using Gentoo, you can emerge the package app-misc/ssmtool from the GURU overlay.

Otherwise, you can use pip install ssmtool (or pip3 if appropriate).

On Linux, the appearance of the app depends on the system Qt theme.

Packaged versions for Arch Linux will be created in the near future.

Windows and macOS

Even though we encourage the use of free software, nonfree operating systems are still supported by this tool. Standalone versions are available from the GitHub Releases page.

Alternatively, or if you would like a development version, you can also run the same pip command above.

Only 64-bit Windows 7+ is supported. On Windows, you need the Microsoft Visual C++ Redistributable Package, which you may or may not already have.

Anki Add-on (Required for card creation)

Download and install Anki desktop (not the mobile version or Anki Universal). Skip this step if you have already installed it.

Then, install the AnkiConnect add-on. You do not have to change any settings for it.

Mac users: You must keep Anki open in the foreground (i.e. visible on your desktop), or otherwise disable the App Nap feature. If you do not, AnkiConnect will not respond, and this program will become very slow or unresponsive.

Install the Click Copy Sentence extension for your browser (Firefox or Chrome).

Configuration

First, select a target language from the list. Then, select a dictionary. We recommend using Google Translate only if the other two are not available, because translations are generally less detailed than dictionary definitions and may not provide the full range of meanings needed.

We recommend leaving lemmatization on as it is by default. It can greatly boost dictionary coverage for many languages.

Next, on the Anki tab, you will see a number of settings. You usually do not have to change the first one, the API endpoint, unless you configured a different endpoint in AnkiConnect (in which case you will know what to enter). You should then select a deck and a note type, and map the data fields to note fields. To do this, you must have a note type with at least three fields, one each for Sentence, Word, and Definition.

If you do not have one, you can download it from here and import it into Anki. Then, select “ssmtool-reading” and simply match the field names.
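Conceptually, the field mapping is a small table from ssmtool's three data fields to the fields of your note type. The sketch below shows one way such a mapping could be checked and applied; it is not ssmtool's implementation, the function name map_fields is hypothetical, and the note field names in the usage note are invented.

```python
from typing import Dict, List

# The three pieces of data that each card needs.
DATA_FIELDS = ("Sentence", "Word", "Definition")


def map_fields(
    mapping: Dict[str, str], note_type_fields: List[str], data: Dict[str, str]
) -> Dict[str, str]:
    """Return {note field: value} after checking that the mapping is usable.

    mapping maps each data field to the name of a field in the note type.
    """
    missing = [name for name in DATA_FIELDS if name not in mapping]
    if missing:
        raise ValueError(f"no note field chosen for: {missing}")
    targets = [mapping[name] for name in DATA_FIELDS]
    # Each data field needs its own note field, hence at least three fields.
    if len(set(targets)) != len(targets):
        raise ValueError("each data field must map to a different note field")
    unknown = [target for target in targets if target not in note_type_fields]
    if unknown:
        raise ValueError(f"note type has no field named: {unknown}")
    return {mapping[name]: data[name] for name in DATA_FIELDS}
```

For example, with a note type whose fields are Front, Target, and Back, the mapping {"Sentence": "Front", "Word": "Target", "Definition": "Back"} would place each value in the matching field, while mapping two data fields to the same note field is rejected.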

You’re done! Now you are ready to mine sentences.

Usage

General

When you see a sentence anywhere, you can simply copy it to the clipboard. It will appear in the Sentence field right away. Then, double-click on any word, and you will get the definition. You can look up words from the Definition field too. When you are satisfied with the data, click on Add Note to send it to Anki.

Note: It seems that clipboard changes may not always be detected on macOS. In that case, simply use the “Read clipboard” button.

Browser

When you turn on the extension, you will notice that sentences are underlined in green (will become configurable in the future). Whenever you click on any word, ssmtool will receive both the whole sentence and the word under your cursor. The word will be looked up immediately too. Chances are, with lemmatization on, this is exactly the word you want. In that case, just press Ctrl/Cmd + S, and you can keep reading!

mpv

You can use this tool in combination with mpv with the mpvacious plugin to make it copy subtitles to clipboard continuously. Audio data is currently not supported, but may be added in the near future either as an extension to ssmtool, or as a standalone mpv plugin.

AwesomeTTS

Currently, ssmtool does not support adding audio directly during card creation. However, you can easily get audio during review with the AwesomeTTS Anki add-on. After configuring it, go to Tools > Manage Note Types > (select your note type) > Cards > Add TTS. Then, select the Sentence field, the Word field, or both. Refold recommends putting audio on the back of the card, but it is also possible to put it on the front side.

Feedback

You are welcome to report bugs, suggest features/enhancements, or ask for clarifications by opening a GitHub issue.

We have an official chatroom on Matrix.

You are also welcome to ask _z#6358 on the Refold Discord for help.


  1. “Full support” simply means the full feature set is available. Usability can depend on a lot of other factors, such as dictionary coverage and lemmatizer accuracy, both of which can vary depending on the language.

  2. Only the Brazilian variant. It will be selected automatically if Google Dictionary is chosen.