Familypedia:Multilingual articles

Introduction

Familypedia is open to contributors from all languages. The number of articles in each language is determined mostly by how many contributors there have historically been for that language. Familypedia needs administrators representing each of the world's languages. Please contact User:AMK152 or User:Robin Patterson or any other administrator if you have interest in helping contributors using languages that are under-represented on Familypedia.


Articles in the Main, Template, Help, or Project namespace are not assumed to be in English; they may be in any language and have any title, including titles in non-Latin scripts such as Arabic, Chinese, Cyrillic, or those used for Sanskrit. To avoid collisions between pages in different languages that share a name, Familypedia requires a language suffix on the page name. A full list of language suffixes supports over 300 languages at Familypedia. Category pages must not use this convention: because of limitations in the MediaWiki software, category structures cannot be shared between different pages, so all multilingual content must be indicated on the same category page. Please do not use these suffixes in the Category namespace.


The majority of early articles will support a single language. As the wiki evolves, these articles will be of interest to families whose descendants speak different languages. This project page makes recommendations for how multilingual articles should be structured and linked.

User need

The typical use case is an ancestor who came from another country. For example, a U.S. family wants an article about a German ancestor to be in English. Families in Switzerland may want it in German. Italian-speaking Swiss related to the family want it in Italian. Part of the problem is that all articles share the same data, so a discovery about the month of birth of a distant ancestor should be shared by all articles without the need for tedious searches for every article mentioning the fact. This part of the solution has been discussed on Project:Info pages, but they have been superseded by the creation of sensor pages and the use of forms for data entry. The solution proposed here does not depend on Info pages or sensor pages, but is provided by Tab templates.

(Note that the "SMW" proposals further down the page will supersede parts of the first few of the following paragraphs.)


Two tiers of multilingual support for articles

Introductory tier (aka "second tier")
This level requires no additional effort by contributors and simply provides headings for tabular information in the language a visitor has chosen in their User Preferences. Since it is available only to logged-in users, and sentences in the body of the article are not translated, it will have some value to visitors, but is an incomplete solution.
Top tier
In this tier, the visitor will see a link in the upper left hand corner indicating their language name in their native script. Upon clicking the language link, they will be taken to the subject's article written entirely in the desired language. From here, other subpages may be available in the chosen language.

Top-tier multilingual article support operates independently of user preferences, so it is available to users who have not created an account, and it offers search targets to Google in multiple languages. Second-tier support requires the user preference for language to be set, but has the advantage that articles that have not yet been translated are still accessible for looking up genealogical data.


More detail on the structure

Glossary

Details of introductory ("second tier") support

This level simply provides translations for headings in articles or for tabular information provided in templates such as Template:Showfacts children and Template:Showfacts person.

What is required for this level is that templates use a MediaWiki function (int) used for looking up text in alternative languages. The tables at Familypedia:Multilingual messages provide a list of these commonly used text messages. Contributors familiar with languages on that page are encouraged to enter the translations for the messages. (Some languages have progressed well with this.)
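The lookup behaviour described above can be sketched outside the wiki. In wikitext a template would call the function as {{int:message-name}}; the Python below only imitates the idea of choosing text by the reader's language and falling back to English. The message keys, the translations, and the placeholder shown for a missing key are all assumptions made for illustration, not Familypedia's actual messages or MediaWiki's exact behaviour.

```python
# Hypothetical message table: message key -> {language code -> text}.
# These keys and translations are illustrative only.
MESSAGES = {
    "children": {"en": "Children", "de": "Kinder", "fr": "Enfants"},
    "birth": {"en": "Birth", "de": "Geburt"},
}


def lookup_message(key: str, user_lang: str, default_lang: str = "en") -> str:
    """Return the text for `key` in the user's language, else the default language."""
    translations = MESSAGES.get(key, {})
    if user_lang in translations:
        return translations[user_lang]
    if default_lang in translations:
        return translations[default_lang]
    # Unknown key: return a visible placeholder so the gap is easy to spot.
    return f"<{key}>"


print(lookup_message("children", "de"))  # translated heading
print(lookup_message("birth", "fr"))     # no French entry, falls back to English
```

A heading with no translation yet still renders in the default language, which is why untranslated articles remain usable at this tier.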

Creating your own templates or multilingual headings

See Familypedia:Multilingual messages for instructions on how to use these messages in place of text so that your templates and articles can provide second tier support.


Top-tier multilingual articles

This higher tier requires more effort but delivers a higher quality experience to the user.

(TODO- insert simple example with article with single title and all alternative language translations as subpages of the article. We have one on another page, for a Swedish man in French and Chinese?)

A more complex example may be found with William I, King of England (1027-1087). Every language seems to know this figure with a different name, so it does not make sense to share them under the same base article name.

Example 2: Complex case — Multilingual article with multiple language titles

Instead of storing the article text in each subpage with the ISO language name suffix, the subpage, e.g. William I, King of England (1027-1087)/fr, contains a redirect to the correct page, which may itself be a base page (e.g. Guillaume le Conquérant (1027-1087)). This approach is more complex, but is more natural for speakers of other languages who know the historical figure by a completely different name. The cost of this service to the user is that the contributor must do more work, creating a mirror of the redirects under the original article. Further, contributors must keep these cross-language interlinks up to date. In many respects this structure follows the same procedure as interwiki links in Wikipedia, though it involves a little more typing. We may be able to develop bots to monitor it.

The tabs interface works because, no matter which language the current article is in, it has subpages off the current base page for each language. That is all that the tabs article requires, and this will work regardless of whether it conforms to other guidelines, such as those for sensor pages.
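A minimal sketch of how a language tab could resolve its target under this scheme, assuming each language subpage either holds the article text or redirects to a differently named base page. The in-memory page store, the placeholder bodies, and the function name are hypothetical; a real wiki would follow redirects itself.

```python
from typing import Optional

# Hypothetical page store: title -> ("text", body) or ("redirect", target title).
PAGES = {
    "William I, King of England (1027-1087)": ("text", "(English text)"),
    "William I, King of England (1027-1087)/fr": (
        "redirect", "Guillaume le Conquérant (1027-1087)"),
    "Guillaume le Conquérant (1027-1087)": ("text", "(French text)"),
    "Guillaume le Conquérant (1027-1087)/en": (
        "redirect", "William I, King of England (1027-1087)"),
}


def resolve_language_page(base_title: str, lang: str, max_hops: int = 5) -> Optional[str]:
    """Return the title that finally holds the article for `lang`, or None."""
    title = f"{base_title}/{lang}"
    for _ in range(max_hops):  # bounded, so a redirect loop cannot hang the lookup
        page = PAGES.get(title)
        if page is None:
            return None  # no subpage for this language yet
        kind, value = page
        if kind != "redirect":
            return title
        title = value
    return None


print(resolve_language_page("William I, King of England (1027-1087)", "fr"))
```

The mirrored "/en" entry under the French base page is what lets a reader return to the English article, and it is the part contributors have to keep up to date by hand.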

SMW scheme

Intro?

Paragraph by Phlox copied from another forum

SMW has freed us from many of the constraints of the prior multilingual approach. People can do things as I described in the multilingual article; that will continue to work and will eventually be converted by bot to the new mechanism. SMW allows us to look up the base article now, so we need not use subpages and lots of otherwise unnecessary redirects from subpages to tie articles together. The Hague / Den Haag (nl) demonstrates this. The neat thing is that it is much easier to set these up (click to edit with form on The Hague article), and add the Russian version of the Hague article. As with the top-tier support I described in the multilingual article, logged-out users will be able to arrive at an article in their language and read it. The UI will be in English, but we can put a banner note on these articles explaining that if they create an account they will be able to have menus and buttons in their native language. If you click on the Nederlands button on the Hague article, you will see that I can change the Wikia UI for one article. As soon as the visitor moves away from that article, though, the UI goes back to English.

Details of SMW proposals

The following is experimental, and revises the top tier scheme described above. The only articles affected are those I (Phlox) have created, so I don't expect to hear many complaints.

Background

For visitors to take advantage of the second-tier (MediaWiki message) capability of tables, they have to be logged in, which is too high a barrier. The idea of the top tier was to use subpages, but many place names are dissimilar from each other. How does a user find "Den Haag" as "The Hague/Den Haag"? It seems these should be top-level strings. However, that's a lot of names to crowd into the same namespace, and there will be collisions, so what we do is append the language code. That means we can tell Den Haag (.de) from Den Haag (.nl).

Naming standards

So the naming standard shall be:

  • Place names are as determined by the Wikipedia for the language. The default base article is the English version.
  • Familypedia has an article named identically, and it stores the universal properties for the place: the translations to all languages, the coord, containment hierarchy, etc. The other languages must not store this data redundantly. They will store information local to that language, but properties concerning information that is true across all languages (coordinates etc) are in the English version.
  • Non-English versions must append their language code to the end of the article name. Place names must be spelled exactly as they are in the Wikipedia for that language.
  • This central (English) version can easily be found by searching for the current pagename in the translated articles field. The only article in which the page will appear is the English article that has that page registered in its language version property.
  • Templates have versions with the same convention of an appended language code, except that they are subpages, as in Commons {{nav place}} {{nav place/nl}}. Text is free of the constraints of multilingual messages. Users just translate the strings as they please and reformat tables as necessary.
    • Common names that users see might be made more friendly, e.g. Template:Toonfacts kinderen might be used to redirect to Toonfacts children. However, many templates depend on a common basepage structure, and many have parameters that would also have to be internationalized. This could be done, but such divergence of our templates would probably consume too much time, which might be wasted should we need to make global changes to all multilingual templates. Template forks with such customizations would not benefit from improvements.
  • A small text language tab bar is presented along the top of an article if any alternative language articles are available.
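The naming rules in this list can be sketched as two small helpers: one that builds a title with a language code, and one that finds the central English article by searching which article has a page registered as a language version. The " (nl)" suffix format is an assumption taken from the "Den Haag (nl)" example (the page also writes "(.nl)"), and the dictionary below is a hypothetical stand-in for the language version property stored on the English article.

```python
from typing import Optional


def localized_title(place_name: str, lang: str) -> str:
    """Build an article title; English is the default base and takes no suffix."""
    if lang == "en":
        return place_name
    return f"{place_name} ({lang})"  # assumed suffix format


# Hypothetical stand-in for the language version property of each English article.
LANGUAGE_VERSIONS = {
    "The Hague": [
        localized_title("Den Haag", "nl"),
        localized_title("Den Haag", "de"),
        localized_title("La Haya", "es"),
    ],
}


def find_base_article(page_title: str) -> Optional[str]:
    """Return the English article that registers `page_title`, or None."""
    if page_title in LANGUAGE_VERSIONS:
        return page_title  # already the central article
    for base, versions in LANGUAGE_VERSIONS.items():
        if page_title in versions:
            return base
    return None


print(find_base_article("Den Haag (nl)"))
```

Because the German and Dutch titles differ only in the code, both can exist side by side and still lead back to the same central article.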


Technical discussion

Motivation for this feature: Even modest genealogical databases run into the problem of person-name and place-name variation, caused by migrations between places that use different languages. For example, the ancestor Frederik Hendrik van Oranje (1584-1647) would be recorded as it:Federico Enrico d'Orange for his Italian descendants, fr:Frédéric-Henri d'Orange-Nassau, en:Frederick Henry, pt:Frederico-Henrique, príncipe de Orange, or es:Federico Enrique de Orange-Nassau. All the same chap. How do we know whether they were born in the same place at a similar time? Well, Frederik Hendrik's life events happened in The Hague, but records from Spanish descendants would list this place as es:La Haya, the Germans and Dutch call it de/nl:Den Haag, and then there are also it:L'Aia and pt:Haia. Even if Familypedia had contributors in a single language, we would still require normalization of these names into a single common namespace. This goes beyond traditional text normalization, which eliminates variations due to text encoding (non-Unicode) or morphological variations (stemming) in a given language. Familypedia requires normalization of these proper names into a common semantic space that transcends language boundaries. All attributes of a place or person will be held in a single common place (known internally as the swmbasepage), with all language variations stored in that same place. This scheme is universal and can also be applied to names of events, employers, etc.; it is limited only by what the contributors judge relevant. The namespaces for place and person will be normalized first.
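As a rough illustration of this kind of normalization, the sketch below maps every known language variant of a place to one base page, so that two records can be compared by base page rather than by spelling. The place names come from the example above; the data layout, the accent folding, and the function names are assumptions for illustration, not the SMW implementation.

```python
import unicodedata
from typing import Dict


def fold(name: str) -> str:
    """Case-fold, strip accents, and collapse spaces so trivial differences do not block a match."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


# Hypothetical variant table: base page -> {language code -> local name}.
VARIANTS = {
    "The Hague": {
        "en": "The Hague", "es": "La Haya", "nl": "Den Haag",
        "de": "Den Haag", "it": "L'Aia", "pt": "Haia",
    },
}


def build_index(variants: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Map each folded variant to its base page (a later entry overwrites a colliding one)."""
    index = {}
    for base, names in variants.items():
        for name in names.values():
            index[fold(name)] = base
    return index


INDEX = build_index(VARIANTS)


def same_place(a: str, b: str) -> bool:
    """True when both names are known variants of the same base page."""
    base_a = INDEX.get(fold(a))
    return base_a is not None and base_a == INDEX.get(fold(b))


print(same_place("La Haya", "den haag"))
```

The same table shape would work for person names, with the base page being the person's central article.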

Simple usage example: When names of people or places are entered, autocompletion will suggest the known place and person names. This will preempt the entry of duplicates and follows what is regarded as best practice in the knowledgebase community: capturing as much semantics as possible from the domain expert at the time of entry. A very large multilingual knowledgebase is available: the internationalized Wikipedias. For places, this requires at least one article for every locality, county, subdivision and country worldwide. This is a considerable namespace that will be imported via a pywikipedia bot, probably focusing on European places first, since that is where these confusions occur most often. Actual articles about the places and persons are not necessary in each language, though they will be welcome. What is important is that the language variations will be available for matching algorithms to use. The heuristics for semi-automated matching represent the more complex usage of this data.
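The autocompletion idea can be illustrated with a simple prefix match over known names in any language, each suggestion carrying the base page it belongs to. This is only a sketch under assumed data: the name table is hypothetical, and a production form would query the wiki rather than scan a dictionary.

```python
from typing import List, Tuple

# Hypothetical table of known names (any language) -> base page.
KNOWN_NAMES = {
    "The Hague": "The Hague",
    "Den Haag": "The Hague",
    "La Haya": "The Hague",
    "L'Aia": "The Hague",
    "Haia": "The Hague",
}


def suggest(prefix: str, limit: int = 5) -> List[Tuple[str, str]]:
    """Return up to `limit` (name, base page) pairs whose name starts with `prefix`."""
    needle = prefix.casefold()
    matches = [
        (name, base)
        for name, base in KNOWN_NAMES.items()
        if name.casefold().startswith(needle)
    ]
    matches.sort()  # stable, predictable order for the suggestion list
    return matches[:limit]


print(suggest("la"))
```

Whichever variant the contributor picks, the entry is stored against the same base page, which is how duplicates are avoided at the time of entry.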

Since this data will be available for matching identical places and people, it is also available for presenting data to end users in their language. It would be possible to look up names in localized languages using a single inexpensive query per name. Since this is possible with a modest increase in effort, this side benefit has become a design goal of the 2009 initial creation of SMW-based templates.

The central goal is not to increase our appeal to global audiences, though global collaboration will be of great assistance to researchers attempting deep genealogical trees.
