Dec. 9th, 2013

alexpgp: (St. Jerome w/ computer)
The tail end of the item I had been working on for the past couple of days morphed into one of those texts whose terminology appears in none of my references, and that Google can find only... in copies of the very same text posted on various sites around the Internet.

Not very helpful, let me tell you!

So I did my best, explaining what I could using small words.

* * *
It's become difficult to keep track of just what BitTorrent Sync is doing on my Raspberry Pi microserver, so I've created a cron job that runs du on the appropriate subdirectory tree and uses diff to compare the output against the state of that tree 24 hours earlier. Things appear to have steadied now that pretty much all of the files in the synchronized directories have been duplicated. My concern had been that, at various times over the past few days, BTS did not seem to be doing much of anything, and it would then turn out that the service had stopped on the Pi for some reason.
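For anyone who wants the same kind of daily check without shell tools, here is a rough Python sketch of the idea. It is not the actual cron job (which uses du and diff). The function names and the snapshot file are made up for illustration, and unlike du it counts only the apparent size in bytes of the files sitting directly in each directory.

```python
import difflib
import os
from pathlib import Path


def tree_sizes(root):
    """Return one 'bytes<TAB>relative_dir' line per directory under root.

    Assumption: each total covers only the files directly inside that
    directory (du would also include subdirectories and count disk blocks).
    """
    lines = []
    for dirpath, _dirnames, filenames in os.walk(root):
        total = 0
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # A file may vanish mid-walk while a sync is in progress.
                continue
        lines.append(f"{total}\t{os.path.relpath(dirpath, root)}")
    # Sort by path so two snapshots line up for comparison.
    return sorted(lines, key=lambda line: line.split("\t", 1)[1])


def compare_with_previous(root, snapshot_file):
    """Diff the current sizes against the saved snapshot, then overwrite it."""
    current = tree_sizes(root)
    snapshot = Path(snapshot_file)
    previous = []
    if snapshot.exists():
        previous = snapshot.read_text(encoding="utf-8").splitlines()
    snapshot.write_text("\n".join(current) + "\n", encoding="utf-8")
    return list(
        difflib.unified_diff(
            previous, current, fromfile="previous", tofile="current", lineterm=""
        )
    )
```

Run once a day with the snapshot file kept outside the synchronized tree, an empty result would suggest that nothing changed since the last run, which is either a steady state or a stopped service.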

As a data point, as long as ownCloud doesn't have gigabytes of data queued up to go through the pipe, it's faster at moving files from Here to There than BTS.

* * *
The next step in the Master Glossary Search Setup™ was to try to get a better handle (or any handle at all) on doing for French what my 18-year-old code does for Russian. The first order of business was just to get the search to find entries and display them properly. The displaying part did not happen, even though the incoming HTML metadata declares the content to be UTF-8, because the Chrome feature that automatically determines a page's encoding insisted the text of the page was ISO-8859-1. Specifically pointing Chrome in the right direction ("Hey, this page is in UTF-8!") got things to display correctly for French, but then Chrome needed to be "repointed" back to CP-1251 to render the Russian search page properly. I'll figure it out eventually...
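The display problem comes down to the same bytes being read under the wrong encoding. A small Python sketch makes the symptom visible. The helper name render_as is hypothetical, and this only imitates what a browser does when it guesses an encoding. It says nothing about how Chrome's detection actually works.

```python
def render_as(raw: bytes, assumed_encoding: str) -> str:
    """Decode raw bytes the way a viewer would if it assumed this encoding.

    Undecodable bytes become U+FFFD (the replacement character) instead of
    raising an error, roughly like the boxes or question marks a browser shows.
    """
    return raw.decode(assumed_encoding, errors="replace")


french_page = "abêtir".encode("utf-8")    # bytes of a UTF-8 French page
russian_page = "ёлка".encode("cp1251")    # bytes of a CP-1251 Russian page

print(render_as(french_page, "utf-8"))        # abêtir
print(render_as(french_page, "iso-8859-1"))   # abÃªtir
print(render_as(russian_page, "cp1251"))      # ёлка
print(render_as(russian_page, "utf-8"))       # replacement characters, not Russian
```

The two-byte UTF-8 sequence for ê shows up as two separate ISO-8859-1 characters, which is the classic sign that a UTF-8 page is being read as ISO-8859-1.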

One interesting step is going to involve French search strings that automagically search for accented characters even when they are not used in the search string (or when they do not appear where they should in the database). For example, using the query string abetir when looking for abêtir, or having the right definition come up if the query uses the circumflex but no circumflex appears in the glossary entry.

This is, in my opinion, useful in cases where one does not have convenient access to a method of entering accented characters, or where databases are not quite up to snuff (for one reason or another). I already have this feature in place in the Russian search routines, which treat the 'e' and the 'ё' characters as the same character (since there is no really standard usage defined for the latter).
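One common way to get this kind of accent-blind matching is to fold both the query and the glossary term to an unaccented form before comparing them. The sketch below uses Python's standard unicodedata module. It is an illustration of the general technique, not the existing Russian search code, and the names fold and search as well as the sample glossary are invented for the example.

```python
import unicodedata


def fold(text: str) -> str:
    """Lowercase the text and drop combining marks (accents).

    NFD normalization splits 'ê' into 'e' plus a combining circumflex;
    dropping the combining characters leaves the bare letter.
    """
    decomposed = unicodedata.normalize("NFD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search(glossary: dict, query: str) -> list:
    """Return (term, definition) pairs whose folded term equals the folded query."""
    key = fold(query)
    return [(term, definition) for term, definition in glossary.items()
            if fold(term) == key]


# Hypothetical data: the second entry is missing its circumflex.
glossary = {
    "abêtir": "to dull the mind of",
    "bruler": "to burn",
}

print(search(glossary, "abetir"))   # finds the accented entry
print(search(glossary, "brûler"))   # finds the entry that lacks the accent
```

Folding both sides covers the two cases at once: a query typed without accents and a glossary entry stored without them. The same function happens to map 'ё' to 'е', but it would also turn 'й' into 'и', so for Russian a single explicit replacement of 'ё' is probably the safer route.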

Onward!
