Naidheachdan - News
¿Qué pasa? Not a bad question. Quite a few things actually. The
dictionary continues to grow and now has just over 60,000 entries.
Not bad, even if we say so ourselves. As a result of this growth,
there are of course knock-on effects. The most immediate is that the
upcoming version of the Dearbhair Beag (version 2.8) will
contain 877,365 forms. The single biggest collection of Gaelic words
ever. And further down the line, this growth also fills out other
word-based resources, such as the files for the Gaelic version of Scrabble3D -
which incidentally now offers you games at three levels: basic (the
2000 most commonly used words), advanced (with all words in the Faclair
Beag) and crazy (with all of Dwelly’s words) - and a few other
things, such as GCompris,
a suite of educational games for children which also include word
A partial extract of our data is also helping two other great
improve their search functionalities with regard to other Gaelic
dictionaries which don’t have an inbuilt lemmatizer (sorry,
geek-speak, a tool that leads you from thaighean to the entry for
taigh). So small as Gaelic may be, in terms of dictionary resources,
we’re probably punching above our weight in some ways. None of the
Irish dictionaries for example have a lemmatizer. So there :)
We’re also about to enter into a collaboration with the Gaelic
Corpus project at Edinburgh University who are going to use our
lexical database to help speed up the tagging of the corpus. So
another nice side effect of the Faclair Beag.
In other news we are continuing with the cleanup of the dictionary
and the to-do list is fortunately shrinking. Just this month I
finally completed replacing all the placeholders which were
inherited from the Faclair nan Gnàthasan-cainnte (things
like [N] or [ADJ]) with full examples. Yes, there were indeed
thousands of them. Hence the dram I’m having just now! Well, cup of
darjeeling, but I might have a dram later on...
If you have any great ideas on what one could do with such a
wordlist, do let us know, we don’t bite!
So, what’s new I hear you asking? Not a massive amount. We’re
slightly past 54,000 entries, which is nice and we’re looking
forward to the 55,000 milestone. One of the things that have kept
Michael busy this summer was the remake of the iGàidhlig
website, a one-stop shop for Gaelic software. Anything from
predictive texting to digital whiteboards, advice on how to type
accents to installing LibreOffice or Microsoft Office in Gaelic.
Apart from being a lot more graphic, the new site is also bilingual,
which is nice if you want to point IT support at the site (since
they often don’t speak Gaelic). Give it a go, it’s braw!
As we’re heading into the long, dark nights of the Scottish winter,
why not put aside the sudoku and help us work out some of these words which we can’t quite
figure out? Go on, you know you like a challenge!
Still having intermittent problems with the hosting, I’ll spare you
the details but we are trying to resolve the issue. Bear with us!
Under the hood
2012 was a pretty horrendous year in terms of workload, both for
Will and me so apart from adding steadily to the dictionary (there
are now just over 50,000 entries - entry 50,000 having been dìthean-caorach),
there hasn't been a huge amount of stuff apart from the predictive
texting and the maps. Ok, so maybe not so minor. Anyway, we’re
getting back into the swing and fixed some under-the-hood stuff that
won't affect you as dictionary users on the whole but that will make
life easier for the editor. One visible change is that the sound
player now only appears when there actually is a sound file (that
had been annoying many people, sorry about that).
We’ve got plans for a lot of stuff, like better fuzzy searches (the
current system is just too Anglo-centric), an RSS feed, maybe a
tooltip dictionary tool... we’ll see, time and work permitting!
Minor Update: We’re now also showing negative
votes, that is, words where a voter has stated that they do
not know the word in question. There aren’t anywhere near as many of
them as there are positive votes but it can still be useful to see
that a word is not used by people or in a certain area. In the word druid
Guess what? Yup, we’ve used and abused the Faclair
Beag again, this time to create a predictive texting tool for
Gaelic. In a way it’s a totally logical step and in a way slightly
left-field so let me expand a little on the convoluted history.
There are two main strands to this. Anyone who has ever tried to
text in Gaelic knows what a chore it is to do it letter by letter.
It’s slow at best, even slower if you're dìcheallach and try
to put in the accents. And rather frustrating because, depending on
your phone, the system will keep trying to “correct” your Gaelic to
English so if you're not careful, tha very easily becomes the.
So given the enduring popularity of texting, a language in a
technologised country today cannot really afford not to have
predictive texting. You’d think so, wouldn’t you? Seems like arts
projects are SO much more popular when it comes to funding... but
anyway. So I’ve always kept an eye open for an opportunity.
One thing I did not want to do is fall into the trap that the Irish
did some years ago with Téacs. It was a good idea in the sense that
predictive texting was needed for Irish. But the mistake was to go
with a custom built solution. Doing something like this means two
things: 1) You tie yourself to a development process to ensure that
new phones are properly supported, bugs are fixed and new features
(if needed) developed and 2) You tie the end-user to your specific
Is that so bad? Yes and no. The problem with Téacs was that
it only offered Irish. A bit duh because we all know that
one thing bilingual people do is switch between languages a lot. So
what about English? And the second problem was that they didn't
maintain the program properly which meant that it very soon became
very out-of-date. So their list of supported phones in 2012 looks
rather embarrassing (also the year the site went offline).
So I knew that if this was going to work long-term, it would most
likely mean joining a bigger project which was open to new
languages. And in July last year my periodic trawling of the web for
such a project finally threw up a result - a project called Adaptxt
which had just gone Open Source.
This is where the other strand of the story comes in. At the most
basic level, predictive texting just relies on a wordlist but many
such tools, including Adaptxt, are much smarter than that and try to
rank words according to how common they are (so you’re not offered interregnum
when you’re looking to type internet). And in the case of
Adaptxt, they also try to get smart about predicting the next word
you’re likely to type. In the case of Gaelic this means that if you
type Bha, you’re offered mi as a likely next word.
Sure, Adaptxt is capable of learning but who wants to train their
phone from scratch every time they buy a new one?
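To make the two ideas above a bit more concrete, here is a minimal sketch in Python of ranking completions by frequency and guessing the next word from word pairs. It is not how Adaptxt works internally; the tiny corpus and the function names (build_model, complete, next_word) are made up purely for illustration.

```python
from collections import Counter

# Toy corpus invented for this sketch; "." marks a sentence boundary.
# A real model would be built from a very large body of text.
CORPUS = (
    "bha mi sgìth . bha mi toilichte . bha e fuar . "
    "tha mi an seo . tha e blàth"
)


def build_model(text):
    """Count single words and adjacent word pairs in the text."""
    tokens = text.lower().split()
    unigrams = Counter(t for t in tokens if t != ".")
    bigrams = {}
    for prev, nxt in zip(tokens, tokens[1:]):
        # Do not pair words across a sentence boundary.
        if prev != "." and nxt != ".":
            bigrams.setdefault(prev, Counter())[nxt] += 1
    return unigrams, bigrams


def complete(prefix, unigrams, limit=3):
    """Words starting with prefix, most frequent first (ties alphabetical)."""
    matches = [w for w in unigrams if w.startswith(prefix.lower())]
    return sorted(matches, key=lambda w: (-unigrams[w], w))[:limit]


def next_word(word, bigrams, limit=3):
    """Most likely followers of word, based on the pair counts."""
    followers = bigrams.get(word.lower(), Counter())
    return [w for w, _ in followers.most_common(limit)]


UNIGRAMS, BIGRAMS = build_model(CORPUS)
```

With this toy data, complete("b", UNIGRAMS) puts bha ahead of blàth because it occurs more often, and next_word("Bha", BIGRAMS) offers mi first.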
Back in 2009 I was in Dublin on a research trip courtesy of Bòrd na
Gàidhlig to look into speech
and language technology for Gaelic. It just so happened that a
guy called Kevin Scannell (who
is a professor of computing at Saint Louis University and into
Irish and Irish software big time) was in town and we met up in the
Club Chonradh na Gaeilge (highly
recommended!). The evening's a bit of a blur but I walked away with
loads of notes on things like “lexical databases” and “web crawlers”
and whatnot. Shortly thereafter, Kevin successfully reeled me into
the Firefox translation project and stuff just snowballed from there
in terms of Gaelic software translation. Another spinoff was the
Faclair Beag itself. Not so much the idea of it but how we did it.
So instead of just doing a really primitive “list” based dictionary
(e.g. with cù on one side and dog on the other), we
got kinky and took on board much of what Kevin had extolled. As a
result, the Faclair Beag not only knows that cù = dog but
also that cù is the nominative singular of a masculine noun,
that chù is the lenited form, that coin is the
genitive singular of a masculine noun... and so on. This means it’s
not only real smart when it comes to finding the right word if you
put in something other than the citation form (i.e. if you put in conaibh
instead of cù for example) but it also allows us to build
lots of sexy stuff on the back of it.
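For the curious, the difference between a plain list and a lexical database can be sketched roughly like this in Python. This is only an illustration, not the actual Faclair Beag schema: the class names, field names and the build_index and lookup helpers are hypothetical, and the grammatical tag given for conaibh is my own assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    text: str  # the inflected form as written
    tag: str   # grammatical description of that form


@dataclass(frozen=True)
class Entry:
    headword: str  # citation form
    gloss: str     # English translation
    pos: str       # part of speech
    forms: tuple   # every known Form of the headword


# One hand-made entry, for illustration only.
ENTRIES = [
    Entry("cù", "dog", "masculine noun", (
        Form("cù", "nominative singular"),
        Form("chù", "lenited form"),
        Form("coin", "genitive singular"),
        Form("conaibh", "dative plural"),  # tag assumed for this sketch
    )),
]


def build_index(entries):
    """Map every inflected form to the entries it belongs to."""
    index = {}
    for entry in entries:
        for form in entry.forms:
            index.setdefault(form.text, []).append((entry, form))
    return index


def lookup(word, index):
    """Return (headword, gloss, tag) for each entry the word is a form of."""
    return [(e.headword, e.gloss, f.tag) for e, f in index.get(word.lower(), [])]


INDEX = build_index(ENTRIES)
```

Searching for conaibh with lookup("conaibh", INDEX) leads back to the entry for cù, which is all a lemmatizer really does. The same table of forms can then be reused for other tools that need a list of valid word forms.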
Because marrying this with Adaptxt was beyond my ken of the sgoil-dhubh
of programming, I got in touch with Kevin and asked if he’d be up
for doing a joint project to create state of the art predictive
texting for Irish and Scottish Gaelic - which he was! And in the
spirit of Goidelic brotherhood, we also decided to do Manx Gaelic at
the same time. So we took the Gaelic data from the Faclair Beag,
Kevin then ranked each word using a massive text corpus he has (so
for example, conaibh is out, but tha is right there
at the top) and sent me that data. I then invested a certain amount
of sweat, coffee and patience and to cut this long story short, here
we are. So, interested? You can just search for “Adaptxt” in Google
Play (the installation isn’t hard) but if you want a step by step
illustrated guide, here's
one I did earlier. An dòchas gun còrd e ribh!
DRINK ME, EAT ME!
The Faclair has gone Lewis Carroll. All entries have been
replaced with white rabbits. Nah, don't panic. Will has improved the
use of the screen size by making the column width adjust to the size
of your screen and/or window. So if you're on a tiny 11" monitor you
probably won't see that much difference but say you're on a massive
70" monster, you'll find that most entries will display over three
lines max, saving you a lot of vertical scrolling:
Alternatively, it also works if you resize your browser window, for
example if you want to have it side by side with a text document
you're working on. Like this for example:
Enjoy - and keep the ideas coming!
Maps, glorious maps
For all those who have been wondering about why we’re collecting all
these votes, there’s finally an answer! They feed our map tool. Like
traditional dialect maps, these give you an indication of where
words are used. Like these two (one of my all-time favourites):
Not all of them are quite that detailed yet but we’re working on it!
So how do you view the maps? Easy, just search for a word and click
on the blue underlined word, for example in the above case, feum and mand.
The map will come up and display any votes and also a link to the Help page
for the maps (which has more detailed info). Oh, and while most
votes are in Scotland, there are some cropping up abroad, especially
in Nova Scotia!
There's now also a mobile version of the Faclair at www.faclair.com/m - same as
the desktop version really but we've collapsed the Advanced Search
feature and used a smaller logo to save space on screen. It's
specifically for mobile phones but if you're on a slow connection,
there's no reason why you couldn't use it on a desktop too but note
that if you're a user with voting rights, you can't get that feature
in the mobile version.
You can also put a link on your mobile phone's desktop now to get to
it real quick. You need to do the following:
1) Android Phones
a) Bookmark the page in the phone's default
b) Go to your Bookmarks and press and
hold and when the options come up, tell it to Add shortcut to
Home. That's it.
a) Go to the page and tap the Bookmark icon
b) When the menu comes up, press Add to Home
Screen. That's it.
A Gaelic Scrabble
Before you ask what Scrabble has to do with the Faclair - it's
yet another one of those interesting uses you can put a database of
words to. With a bit of tidying up (to remove names and other proper
nouns), it's not that hard to build a dictionary file for something
like Scrabble. Want to have
Am Faclair Beag on LearnGaelic
Well, who would have thought that? After lots of meetings and more
draft documents flitting backwards and forwards through cyberspace,
MG Alba have bought, that's right, bought, a license to use our
dictionary data on their new LearnGaelic website.
Though Tahiti is still not an option, this is certainly a welcome
Check your spelling, sir?
One of the first spin-offs we've been working on is a selection
of Gaelic spellcheckers. There's a couple out there already but,
well, let's just say they're not being maintained well.
So, by using the database in the Faclair, we've been able to join up
with an Open Source project called Hunspell and a script druid called
Kevin Scannell to create
spellchecking tools which will work in Mozilla Firefox and Thunderbird, Opera and LibreOffice/OpenOffice. If
you're using the Gaelic version of Firefox/Thunderbird and
LibreOffice, the spellcheckers already come bundled with the
software but if you're using the English version, you can get the
Mozilla spellchecker here
and the LibreOffice one here
(also works in OpenOffice).
By co-operating with other projects in this way, we can ensure that
both the software and the spellchecking dictionary will be
maintained properly and regularly, which means:
- neither will become buggy or stop running on new operating
- we can easily fix errors and add new data from the dictionary
Oh and they're all free of charge!
The Faclair at Rannsachadh na Gàidhlig in Aberdeen
We kind of left it a bit late registering but ended up doing a
paper nonetheless. Well.. I say paper. It was mostly a presentation
really on the timeline of our two dictionaries, starting with the
digitisation of Dwelly's, the birth of the Faclair Beag and the
planned spin-off projects, such as spellcheckers and predictive
texting and so on. Perhaps not high-brow academic as such but I feel
it was a worthwhile paper nonetheless because it shows what you can
do with a properly built lexical database - even a relatively simple
It was well received and one of the members of the audience made
me laugh: he came up to me and said "Don't take this the wrong way
- but only a German could have done this". He then explained that
it was the clear sense of direction of the dictionary project, its
execution and logical progression which had prompted him to make
this amusing compliment. Ach well, the Gàidheileamailtich score
If you want to see the presentation (it's in Gaelic), you can get
the PDF here.
A dictionary is born!
We always said that Dwelly-d would be just the start and so, mar
a chanas iad, 's e gnìomh a dhearbhas - here you are. It's
called the Faclair Beag because, well, it's kinda small still even
though we have big plans for it. So bear with us for now if you find
gaps - but other than that, we hope you find it useful in using or