User:MargaretRDonald/sandbox/Korean articles/Proposal for ESEAP 2024

Session 1: Korean wikidata & Mix'n'Match (workshop)

Preferably a workshop to show how to use the mix'n'match catalogue for the Encyclopedia of Korean Culture (Q624626), to get started on improving the number of matches. The aim will be to add the property P9475 to Qitems. I am hoping the workshop will be bilingual, with me presenting for 10 minutes in English and a Korean native speaker presenting in Korean for the same length of time.

The content of each of our presentations would be different, since attacking the problem is different for English speakers and Korean speakers. Participants will each add several matches during the workshop.

Since the language for communication will be English, part of the session would deal with techniques for those using Roman scripts to tackle the language silos to which most of the rest of the world cannot contribute. (Every day, perhaps a quarter of the wikidata items I encounter cannot be accessed by anyone other than a Korean speaker: the Qitems have no "instance of" and no country; they consist solely of an item number and a Korean label linked to a kowiki page. The knowledge on the kowiki page is locked away from the rest of the world.)

A mechanism I have found useful for finding Qitems which are devoid of any wikidata is to try to match Qitems via the Encyclopedia of Korean Culture mix'n'match. (My wikidata languages are currently set to English, Korean, Japanese and Chinese. Thus, typically, when I find a wikidata item without an English label I add both an English and a Chinese label.) I would suggest that any Korean using the mix'n'match should set their Babel preferences to at least these four languages, because the driving force behind what I am doing is to open up the knowledge locked away in kowiki (locked away because its wikidata is lacking such things as "instance of", "subclass of", "country", "country of origin" and "country of citizenship"). For cultural items I often just take the Google translation as the English label. For short items, I always try to supply a roman script transliteration as an alias.

Working on the mix'n'match not only provides matches: by opening up the Qitem (and the kowiki page, when checking whether there is a match), one also sees the wikidata, together with its need to be supplemented. If Koreans were to work on this, I hope that they would supply some key wikidata properties (instance of, country, ...), making objects, people, places, and events accessible to the rest of the world via queries. (For example, when I realised that 성 at the end of a Hangul string implied "fortress", I added "instance of" fortress to all such items in my download of Qitems in openRefine. This instantly added well over 100 South Korean fortresses to wikidata, and immediately a British wikimedian, having found them, set about labelling them, while another (non-Korean speaking) wikimedian added the kowiki page's coordinates to wikidata, making them mappable. In other words, just adding "instance of" made these items accessible to the non-Korean speaking world.)
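The 성 suffix idea above can be illustrated outside openRefine with a minimal Python sketch. This is not the openRefine workflow itself; the item identifiers below are placeholders rather than real Qitems, and the function name is hypothetical. Since 성 also ends many Korean words that have nothing to do with fortresses, the output should be treated as a list of candidates to review, not as confirmed fortresses.

```python
# Hypothetical sample of (item id, Korean label) pairs.
# The ids are placeholders, not real Wikidata Q-numbers.
SAMPLE_ITEMS = [
    ("Q_EXAMPLE_1", "남한산성"),
    ("Q_EXAMPLE_2", "경복궁"),
    ("Q_EXAMPLE_3", "수원 화성"),
    ("Q_EXAMPLE_4", "도산서원"),
]

# The Hangul syllable that, at the end of a label, often implies "fortress".
FORTRESS_SUFFIX = "성"


def fortress_candidates(items, suffix=FORTRESS_SUFFIX):
    """Return the (id, label) pairs whose label ends with the suffix."""
    return [
        (item_id, label)
        for item_id, label in items
        if label.strip().endswith(suffix)
    ]


if __name__ == "__main__":
    # Print the candidates so that a person can check each one by hand.
    for item_id, label in fortress_candidates(SAMPLE_ITEMS):
        print(item_id, label)
```

Each candidate would still need checking against its kowiki page before "instance of" is added.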

Any Korean speaker attending the session would learn how to make their way through a mix'n'match catalogue and would find the work considerably easier than a non-Korean speaker. However, English speakers familiar with mix'n'match would be able to assist and teach during the session because mix'n'match behaves identically across all languages.

I am using the Encyclopedia of Korean Culture catalogue as it provides more matches than the various NLK catalogues, which typically fail to give a match because they concern (largely) modern writers and academics who usually have no wikipage but have a name in common with many other Koreans. Working with these catalogues directly is therefore profoundly unrewarding, though potentially easier than using the Encyclopedia of Korean Culture, and I would not wish to do a session using them. However, if one picks one's way carefully through the Encyclopedia of Korean Culture catalogue (as a non-Korean speaker), one does find many matches and many Qitems in need of supplementation.

In matching, I swap between English and Korean and back again (and back again) within the encyclopedia and also the kowiki page. I often use Google translate to give a roman script transliteration. Additionally, I will sometimes use "edit the original" on the kowiki page to grab the bolded Hangul at the head of an article. If I want to add "commemorates" to a seowon, I often seize the Hangul script from encykor and copy it to a Google translate window, where I can separate out the names. I then hunt the names in wikidata via the Hangul (it is extremely likely that a Korean person has no roman script label in his/her wikidata). I fix up the person's wikidata via their kowiki page, adding birth and death dates as well as an English label to the wikidata, since birth and death dates together with the name are what typically identify someone as being the same person....

As yet I have failed to encourage anyone else in this task, which is daunting, with more than 30,000 items needing to be amended.... And the Qitems with no wikidata at all are uncountable: they cannot be queried.

It appears that the Korean wikimedian community does not use mix'n'match (https://mix-n-match.toolforge.org/#/). See: https://mix-n-match.toolforge.org/#/catalog/4392, https://mix-n-match.toolforge.org/#/catalog/920, https://mix-n-match.toolforge.org/#/catalog/3992, https://mix-n-match.toolforge.org/#/catalog/3993, https://mix-n-match.toolforge.org/#/catalog/3994 and more.

I would really like to see the Encyclopedia of Korean Culture matched to its corresponding wikidata and have been working hard to that end despite neither speaking nor reading Korean. (My Hangul is at the level of a three-year-old.)

However, as with most tools, the positioning of items, buttons and links (and the colouring of links) is identical whether one is reading the tool in Korean, Chinese, or English. So I do think it might be possible to show you (if you have not already used it) how to use mix'n'match and really start to link some Qitems to this information source.

link to submission of slides

Session 2: Uploading and annotating KOGL images to commons (presentation)

Uploading images from public sites is a complex task.

I will demonstrate the task of downloading a Korean public image and uploading it to commons for two Korean websites.

I will look at the licensing of two Korean sites which make images freely available, and at adding all the necessary parts to permit uploading. The sites used are the Korean Cultural Heritage Administration and the Encyclopedia of Korean Culture, and the corresponding licence templates are {{KOGL|문화재청}} and {{KOGL|한국문화백과사전}} (Korean Open Government Licence followed by the attribution for the image).

Keep the images in a private category to permit you to find them easily and, if necessary, update groups of them. For example, one cannot upload an image without its creation date. However, there were numerous Korean images I wished to upload which had no creation date. In desperation, I put 2015. When I found the template {{unknown|date}} I needed to amend a number of images. Having them in a group made this easier to do. Similarly, I made an error when I first used the KOGL template, having failed to realise that the Hangul string following KOGL was the attribution, and again I needed to amend the metadata for several image files.
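To make the pieces described above concrete, here is a minimal Python sketch that assembles the wikitext for a file page. It is an illustration under stated assumptions, not the upload procedure itself: the function name, the example URL and the category name are hypothetical, and using the attribution string as the author field is an assumption. The {{unknown|date}} fallback and the {{KOGL|...}} form are the ones mentioned in the text.

```python
def build_file_wikitext(description, source_url, attribution, category, date=None):
    """Assemble file-page wikitext as one string (illustrative sketch only)."""
    # Fall back to the {{unknown|date}} template rather than guessing a year.
    date_field = date if date else "{{unknown|date}}"
    lines = [
        "{{Information",
        f"|description={description}",
        f"|date={date_field}",
        f"|source={source_url}",
        # Assumption: the attribution string doubles as the author field.
        f"|author={attribution}",
        "}}",
        # The Hangul string after KOGL is the attribution for the image.
        f"{{{{KOGL|{attribution}}}}}",
        # A personal tracking category, so groups of files are easy to find.
        f"[[Category:{category}]]",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # All argument values here are placeholders for illustration.
    print(
        build_file_wikitext(
            description="Example description",
            source_url="https://example.org/image",
            attribution="문화재청",
            category="Example tracking category",
        )
    )
```

Keeping the attribution as a single parameter is meant to make it harder to confuse it with the licence name.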