Unicode - 25-year update

IMDb used to have a newsletter, and the mid-April 1997 issue contained an item titled "The Great ISO Swap" reporting that IMDb had implemented the ISO 8859-1 (also known as ISO Latin-1) character set, allowing names and titles to use all the common letters with diacritical marks of the major Western European languages (such as å, ç, é, ï, ñ, ô, and ù).

http://web.archive.org/web/20060101140203/http://www.imdb.com/Newsletter/newsletter-13#iso

Near the end of the item the following statement appeared:

Ideally all data should be presented using its native character sets/ pictograms. Technically this is not possible though with current widespread software for web access, e-mail and operating systems in general.

In the future there will be a new huge standardized 16 bit character set called Unicode. It will offer the capability to freely combine Japanese Kanji with ISO 1 text and Hindi, for example. We will use it as it becomes widely available and supported by the industry.

I note that some additional character sets have been made available for the Alternate Titles section over the last few years (among them Greek, Chinese, Japanese, Korean, and Cyrillic), and I personally am not that affected by the lack of full Unicode support. However, I know that some contributors here would like to see further progress made in terms of implementing Unicode, so I am bringing this up to mark the 25 years since IMDb announced plans to implement it.

Responses

vsrawat

8 Messages

•

184 Points

4 years ago

When the entire world is moving away from proprietary fonts towards Unicode, IMDb still strictly bans the use of Unicode characters.

What is the reason? How does the use of Unicode characters harm the website?

When will Unicode characters be allowed on the site?

Thanks.

Note: This comment was created from a merged conversation originally titled When will Unicode character be allowed?

J_Potier

1 Message

•

60 Points

4 years ago

A good example of the importance of full Unicode support:

The Romanian film "Față în Față" ("Face to Face") becomes "Fata în Fata" on IMDb - "The Girl in the Girl"! This is also how "Face/Off" is known in Romania, apparently...
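The loss described above can be reproduced with a minimal Python sketch. It assumes, without any confirmation from IMDb, that a legacy pipeline keeps every character that fits in ISO 8859-1 and strips the diacritic from everything else; the function name `to_latin1_lossy` is hypothetical.

```python
import unicodedata


def to_latin1_lossy(text: str) -> str:
    """Keep Latin-1 characters; strip combining marks from the rest (assumed behaviour)."""
    result = []
    # Compose first so that letters such as "î" are seen as single code points.
    for ch in unicodedata.normalize("NFC", text):
        if ord(ch) <= 0xFF:
            # Fits in ISO 8859-1, so it survives unchanged.
            result.append(ch)
        else:
            # Decompose into base letter + combining marks, then drop the marks.
            decomposed = unicodedata.normalize("NFD", ch)
            result.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(result)


print(to_latin1_lossy("Față în Față"))
```

Letters without a canonical decomposition, such as the Polish "ł", pass through this sketch unchanged, so a real fallback would also need an explicit mapping table for them.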

vsrawat

8 Messages

•

184 Points

4 years ago

Without Unicode we cannot even use the Latin alphabet, and astronomy makes major use of it. Most stars have Latin letters in their names, so we cannot properly name a star without Unicode and have to write out the full name of the letter. There are many astronomy-related movies that would need to mention these letters.

tomas_wenigr

4 Messages

•

84 Points

1 year ago

It would be great if imdb expanded the character set for writing movie and person names. My name contains the character "š", but on the name page I have "s" instead of this character. In order for my name to be spelled correctly, the character set would have to be extended by at least "Latin Extended-A". I guess I'm not the only one who has a garbled name due to this lack.

Note: This comment was created from a merged conversation originally titled Character set extension

gromit82

Champion

•

7.9K Messages

•

281.9K Points

Tomaš: While IMDb has been planning for full implementation of Unicode, and they have added certain character sets for certain purposes, they first announced their intent to implement Unicode 26 years ago (see here). So I can't predict when the "š" in your name will be available for use.

3 years ago

adrian

Champion

•

3K Messages

•

72.5K Points

IMDb now supports these characters in the attribute field, so you can enter the attribute "as Tomaš Wenigr". You can see this with some of the newer productions like The Ark.

3 years ago

tomas_wenigr

4 Messages

•

84 Points

@adrian I know about this possibility, but I still find it a shame that some names are changed due to the outdated system. In addition, google has generated a knowledge panel about me that pulls information from this page and I have the wrong name in it.

3 years ago

gromit82

Champion

•

7.9K Messages

•

281.9K Points

dhtbrowne: The problem is that vowels with macrons are not supported by the main character set used by IMDb, ISO-Latin-1.

All the vowels with macrons will be available once IMDb has fully implemented Unicode. However, that is probably still quite a while away; it has been under contemplation since 1997. (See https://web.archive.org/web/20060101140203/http://www.imdb.com/Newsletter/newsletter-13#iso.)

2 years ago

AdrianTofei

42 Messages

•

806 Points

Hello,

My name is Adrian Țofei, but because the IMDb platform doesn't accept the accented character "Ț", I could only be listed as Adrian Tofei. Please add that character so that my name can be displayed correctly on IMDb, as written on my movie's credits, website, social media and everywhere else.

You can find more info on Wikipedia about the Romanian letter Ț. There are many other Romanian actors on IMDb whose names are not correctly spelled because of this issue.

Thanks a lot!

Note: This comment was created from a merged conversation originally titled Please add the accented letter Ț

Filmmaker/Actor

1 year ago

Revolt

3 Messages

•

90 Points

In the Romanian language, we have the A-breve letter "ă" (https://en.wikipedia.org/wiki/%C4%82), but, unfortunately, IMDB does not allow it:

The Unicode character at code point 259 [ă] is not supported.

Since we should be able to use this common letter for Romanian titles, please add support for it.
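The error message quoted above is consistent with a simple range check: ISO 8859-1 covers exactly code points 0 through 255, and "ă" is code point 259. A minimal sketch of such a check follows; the helper name `fits_latin1` is hypothetical and is not IMDb's actual validation code.

```python
def fits_latin1(ch: str) -> bool:
    """ISO 8859-1 maps one-to-one onto Unicode code points 0 through 255."""
    return ord(ch) <= 0xFF


# Print each character with its decimal code point, its U+ notation,
# and whether it would fit in a Latin-1-only field.
for ch in "éîăț":
    print(ch, ord(ch), f"U+{ord(ch):04X}", fits_latin1(ch))
```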

Note: This comment was created from a merged conversation originally titled Make the website inclusive for the Romanian language

1 year ago

dhtbrowne

3 Messages

•

70 Points

We're based in New Zealand and the Māori language uses macrons over letters to denote a longer vowel sound. At the moment in the list of special characters the closest thing I can find is the tilde (wiggly line above a letter) but that's not technically correct and it should be a macron. I would like to suggest that special vowels with macrons be added to the special characters list. In Te Reo Māori (the Māori language) a macron can change the meaning of a word. For example from the film industry - the famous Wētā Workshop should be spelt with macrons over the e and the a, and is the name of an insect. Without the macrons weta means excrement. Quite the different meaning from two little lines!

Note: This comment was created from a merged conversation originally titled Macrons to be added to special characters

1 year ago

markus_7164959

2 Messages

•

70 Points

1 year ago

It isn't fair that other languages' alphabet characters are accepted for the Alternative Titles but not Latvian ones. These are the Latvian alphabet characters that weren't accepted for the Alternative Titles: Ā, Č, Ē, Ģ, Ī, Ķ, Ļ, Ņ, Š, Ū, Ž, ā, č, ē, ģ, ī, ķ, ļ, š, ū, ž. IMDb, please accept all Latvian alphabet characters for the Alternative Titles.

Note: This comment was created from a merged conversation originally titled Accept all Latvian Alphabet characters for the Alternative Titles

davidah_ca

Champion

•

1.9K Messages

•

92.6K Points

Unfortunately this is not a simple change.

The problem is that IMDb was begun very early in the internet period, when 8-bit data was fairly new. IMDb was designed to support only the Latin-1 code page, which does not include all the characters required for non-Western European languages.

Because the current update system has become very complex, simply switching to a multi-byte encoding (like UTF-8) was considered too likely to cause problems. Some of the high-level extended ASCII characters were (are) used as control characters, and it is possible that a Unicode character could break the update routines. Moving to Unicode would require a complete review of the entire software.
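To illustrate the risk described above: in UTF-8 every non-ASCII character is encoded as two or more bytes, all of them in the range 0x80 to 0xFF, which is the same range a Latin-1 system might have reserved for control markers. The sketch below uses a made-up set of reserved bytes (`HYPOTHETICAL_CONTROL_BYTES`); the thread does not say which values IMDb actually used.

```python
# Hypothetical reserved byte values; IMDb's real control characters are not documented here.
HYPOTHETICAL_CONTROL_BYTES = {0xA1, 0xA6}


def find_collisions(text: str, reserved: set[int]) -> list[int]:
    """Return the reserved byte values that appear in the UTF-8 encoding of text."""
    encoded = text.encode("utf-8")
    return sorted({byte for byte in encoded if byte in reserved})


# "š" (U+0161) becomes the two bytes 0xC5 0xA1 in UTF-8.
print("š".encode("utf-8").hex(" "))
print(find_collisions("Tomaš", HYPOTHETICAL_CONTROL_BYTES))
```

A byte-oriented routine that treats 0xA1 as a delimiter would split the name in the middle of the "š", which is why every stage of such a pipeline has to be reviewed before UTF-8 data is let in.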

As you may know, IMDb is currently rewriting the update system. I assume that they are ensuring that it will be able to support Unicode characters. Therefore I doubt that there will be any change to the titles until after that change is complete.

11 years ago

hank_slater

2 Messages

•

70 Points

1 year ago

I am working on a multicultural project featuring characters who are fluent in foreign languages and speak those languages in the TV series. I feel that this, along with the diversity of authority figures in the show, gives the concept a high likelihood of international popularity. I tried to list the alternative titles but was unable to, as only the Western and Cyrillic alphabets are supported on IMDb. I also think that allowing talent to use their native languages for their names or AKA blurbs would help increase worldwide use of the site. I have had difficulty casting Asian actors fluent in their native languages, I suspect because those who don't speak fluent English don't use IMDb, which makes casting for Asian roles where fluent English is not necessary difficult. I think IMDb should be more true to International Movie Database and support international alphabets. What do you think?

Note: This comment was created from a merged conversation originally titled International Text. I think all text types should be supported.

silverbacknet

4 Messages

•

150 Points

1 year ago

Since 2014, the messageboards have supported the full Unicode set, so it obviously isn't that hard. Why doesn't the main site support Unicode? Right now it's still mired in a Western European-centric interface, where all names have to be transliterated to be posted, despite the transliterations being debatable and unofficial. The submission form even recognizes individual Unicode symbols, but specifically disallows them!

Note: This comment was created from a merged conversation originally titled When will IMDB support full Unicode?

MAthePA

2K Messages

•

57.7K Points

Admins, please include the votes and merge this thread into:
https://getsatisfaction.com/imdb/topics/support_for_unicode

7 years ago

taewong

12 Messages

•

370 Points

1 year ago

Unicode is not fully supported in IMDb. For example, in Polish: you could change all references by searching “milosc” and then changing them to “miłość”. And Jiří Hnídek is written without an r-hacek in their first name. The same applies to the ILM person Coşku Özdemır, a Turkish person listed on Cinefex.

Note: This comment was created from a merged conversation originally titled Support for Unicode.

randy_532956

2 Messages

•

100 Points

I agree. This affects titles, names, characters, discussion, and probably more.

Where I run into the problem most is in the discussion forums. If you paste non-ASCII characters copied from somewhere else (for example, to show a symbol that was in the film, or indeed to show the native-language title of the film), they just get turned into what appears to be the HTML text code for those characters, instead of the symbol itself.
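The behaviour described above, where pasted characters turn into their HTML codes, is what happens when text is forced into ASCII with numeric character references and the result is then displayed without being decoded again. A minimal sketch of that round trip, using only the Python standard library (the function name is hypothetical, and this is a guess at the mechanism, not the boards' actual code):

```python
import html


def to_ascii_with_references(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


stored = to_ascii_with_references("Față în Față")
print(stored)                 # what a reader sees if the references are shown literally
print(html.unescape(stored))  # what a reader sees if they are decoded before display
```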

It's 2013. This shouldn't be happening.

13 years ago

dan_dassow

Champion

•

20.7K Messages

•

490.4K Points

Since this is the International Movie Data Base, it is truly surprising that they do not support unicode.

● FAQ: Key Threads - IMDb Poll FAQs Index

● FAQ: Summary Statistics and Poll Index

● BonaFideBoss' IMDbStats

● FAQ: Updating Threads After Poll Goes Live

Follow the IMDb Polls in Facebook and Twitter

13 years ago

taewong

12 Messages

•

370 Points

Since MobyGames supports Unicode, macrons in Japanese are OK for long vowels. Note that the title ends with a punctuation mark (full stop). Hungarian, Czech, Polish, Romanian, Slovak etc. require a bunch of accented letters.

13 years ago

bluesmansf

Champion

•

4.6K Messages

•

236.3K Points

Except that it's "Internet Movie Database," not "International." ;)

13 years ago

dan_dassow

Champion

•

20.7K Messages

•

490.4K Points

This must be a Freudian slip. [wink]
Reminder to self: Don't post when tired.

13 years ago

bluesmansf

Champion

•

4.6K Messages

•

236.3K Points

LOL. To be honest, though, it's almost like they want it to be known that way. Most mentions of the spelled-out name are gone. Kind of like Kentucky Fried Chicken is only KFC now. New visitors seem to be having a hard time figuring out what the site is...video streaming, file sharing?

13 years ago

mightyemperor

Champion

•

1.9K Messages

•

146.1K Points

Or after taking some random prescription medication you found lying in the meep.

13 years ago

taewong

12 Messages

•

370 Points

Yeah. Do not post nonsense. You have accidentally removed a comment (you need to dispute this remove).

13 years ago

dan_dassow

Champion

•

20.7K Messages

•

490.4K Points

It is almost like Randall Munroe has been reading this forum.
http://xkcd.com/1209/

13 years ago

taewong

12 Messages

•

370 Points

You quote the comic: “The Skywriter we hired has terrible Unicode support.”

After correcting Miroslav Kure's surname to Miroslav Kuře (to match Czech support: the Danish/Faroese/Norwegian ø is rcaron) in Battle for Wesnoth 1.11.1 contribution community, you have many problems with the Internet Archive Wayback Machine this time. First the connection is too slow to load and you get the error message “The machine that serves this file is down. We're working on it.” twice. Unicode in their own forum affects subjects (titles) and more. Note that the thread has nonsense!

13 years ago

davidah_ca

Champion

•

1.9K Messages

•

92.6K Points

This has been mentioned many times over the past few years. A bit of history may help here.

When IMDb first started, it was updated by an automated email system. This was at a time when some of the email routers still only handled 7-bit ASCII and special encoding was needed to ensure that 8-bit codes would not be trashed. Moreover, some characters (e.g. | the 'pipe') were used internally (and in the email) as controls/delimiters. This is why you may sometimes see older contributors indicate a credit update as :

John Doe | 2nd Pirate | 22
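A minimal sketch of how a pipe-delimited credit line like the one above might be parsed, and why a reserved delimiter is fragile. The field names (`name`, `role`, `number`) and the meaning of the third field are assumptions made for illustration; the post does not define them.

```python
from typing import NamedTuple


class Credit(NamedTuple):
    name: str    # assumed: the person's name
    role: str    # assumed: the character or role
    number: int  # assumed: a numeric field such as billing position


def parse_credit(line: str) -> Credit:
    """Split a 'name | role | number' line on the pipe delimiter."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 3:
        # A stray pipe inside the data itself would land here.
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    name, role, number = fields
    return Credit(name, role, int(number))


print(parse_credit("John Doe | 2nd Pirate | 22"))
```

Any character that doubles as a delimiter has to be kept out of the data, which is the kind of constraint that makes widening the accepted character set a system-wide review rather than a single change.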

By the time Unicode became standard, the system had grown quite complex. Before Unicode can be implemented, every part of the system needs to be checked and potentially modified to ensure that it will not be broken by any of the Unicode codes.

IMDb is currently in the process of moving the various lists (sections) to new internal systems. I hope and expect that they are designing these systems so that they will be able to support Unicode.

Once the moves have been completed, we may see support for Unicode, but don't expect it any time soon.

13 years ago

taewong

12 Messages

•

370 Points

You will need an answer. You have removed the first reply by accident. Where a name includes a suffix, we use a comma to separate it from the name. On game credits and indexes it is not treated as an integral part of the surname. Examples are:

Hernandez, Jonathan, Jr
Rowe, William A., Jr.
Tibbetts, Richard S., III

It thinks that the Get Satisfaction software uses Unicode. It supports different accented characters for Eastern European languages.

13 years ago

bluesmansf

Champion

•

4.6K Messages

•

236.3K Points

The change log says you removed it...??? What the..??

13 years ago

taewong

12 Messages

•

370 Points

This reply was removed on 2013-03-25.

13 years ago

bluesmansf

Champion

•

4.6K Messages

•

236.3K Points

Yep. And:

3 months ago
taewong, the poster:
Removed a reply in this topic
Reason: removed by the poster

13 years ago

johndeer

2 Messages

•

60 Points

Actually, it seems that after the message-board makeover, Unicode support is even worse! At least with the old ones you could enter most extended ASCII glyphs (assuming proper code-page is set). But now anything that is above 127 doesn’t work.

13 years ago

_imon_falko

2 Messages

•

82 Points

It's year 2014 and some Czech characters are still not supported.

12 years ago

spyros_6842303

12 Messages

•

160 Points

It's almost 2015 and Greek characters aren't supported AT ALL.

12 years ago

sorin_1301777

9 Messages

•

116 Points

This reply was created from a merged topic originally titled
How many years will it take you to understand UNICODE?.

In 2009, in Contact #3034383 (http://www.imdb.com/helpdesk/thread?tid=3034383) you the owner of IMDB promised professional usage of UNICODE "in a little while". It is now 5 years and a half later and your web site is still crippled with no UNICODE implementation. 5 years and a half??? Don't you fill embarrassed with your "professionalism"? Shall we wait another 5 years for IMDB to understand the word "international"?

(This post is addressed solely and specifically to IMDb staff.)

12 years ago

Murray

Employee

•

18 Messages

•

2.9K Points

We are making slow and steady progress on Unicode support. Note that until every single part of a system supports Unicode, none of it works. We have a lot of critical backend systems that need to be migrated. Unfortunately, we don't have a timetable that we can share, but please be aware that we are working on it.

Note that in the last few weeks we've enabled full Unicode support in the message boards:

http://www.imdb.com/board/bd0000043/nest/235469052

We had a number of encoding issues that I believe we have fixed.

12 years ago

Murray

Employee

•

18 Messages

•

2.9K Points

Note that user reviews:

http://www.imdb.com/user/ur2278015/

...and lists:

http://www.imdb.com/list/ls001825868/

...also support Unicode.

12 years ago

spyros_6842303

12 Messages

•

160 Points

Yes, but no movie display titles...

12 years ago

Murray

Employee

•

18 Messages

•

2.9K Points

There is already limited support for this; see the Greek title here:

http://www.imdb.com/title/tt0015648/releaseinfo#akas

Our systems currently use a mixture of ISO-8859-1, UTF-8, and KOI8-R. Untangling this mess while keeping things running is like changing the fan belt on an engine without switching it off.
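Why a mixture of encodings is a "mess" can be seen in a small Python sketch: the same bytes read under the wrong encoding produce different, wrong text. The helper name `misread` is hypothetical; the codec names are the standard Python ones for the three encodings mentioned above.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one codec, then decode the same bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")


samples = [
    ("é", "utf-8", "latin-1"),       # UTF-8 bytes read as ISO 8859-1
    ("é", "latin-1", "utf-8"),       # ISO 8859-1 byte read as UTF-8
    ("Жизнь", "koi8_r", "latin-1"),  # KOI8-R bytes read as ISO 8859-1
]
for text, written_as, read_as in samples:
    print(f"{text!r} written as {written_as}, read as {read_as}: "
          f"{misread(text, written_as, read_as)!r}")
```

Nothing in the bytes themselves says which encoding produced them, so every stored value has to carry (or be assumed to have) a known encoding before it can be converted safely.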

12 years ago

spyros_6842303

12 Messages

•

160 Points

I tried to add a title in a movie but the system didn't let me. It gave an error on every letter I entered.

12 years ago

Murray

Employee

•

18 Messages

•

2.9K Points

Yup. The submissions pipeline doesn't yet handle Unicode.

12 years ago

spyros_6842303

12 Messages

•

160 Points

So, the movie titles written with Greek characters are made by the people inside?

Is there a timeline when I will be able to contribute Greek titles?

12 years ago

Murray

Employee

•

18 Messages

•

2.9K Points

Yes, there were some cases added manually years ago.

We don't have a timeline yet, but we know people really want it.

12 years ago

piotr_balwierz

2 Messages

•

64 Points

3 years have passed and IMDB is still mentally in the pre-unicode 1990's.

If you don't want to fix your database for unicode support, then just write parsers and translate user input to html codes.
Moreover, some html codes are not supported, eg. &nacute;

NB. It is not possible to have a title with a non-basic-latin character. Even if I fix a movie and input an html code, the form will on the fly change it to unicode and report a problem (!!)
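One possible explanation for &nacute; not being recognised, offered here only as a guess, is that the form knows the HTML 4 entity set, which was built around Latin-1 and does not include it. Python's standard library exposes both the HTML 4 table and the larger HTML5 one, so the difference is easy to check:

```python
import html
from html.entities import name2codepoint  # HTML 4 named entities only


def in_html4_entity_set(name: str) -> bool:
    """True if the entity name (without & and ;) is defined in HTML 4."""
    return name in name2codepoint


for name in ("eacute", "nacute"):
    print(name, in_html4_entity_set(name))

# html.unescape uses the HTML5 table and also accepts numeric references.
print(html.unescape("&nacute;"), html.unescape("&#324;"))
```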

10 years ago

Marco

3.4K Messages

•

94.2K Points

The last update on this (at least in this thread) was two years ago, so can a staffer tell us what has happened these past two years regarding this issue?
(I note that in the message boards on IMDb, one could see exactly when a post was made, here I can only see that Murray responded two years ago, not very specific).

9 years ago

gromit82

Champion

•

7.9K Messages

•

281.9K Points

Marco: In response to your latter comment, you can see the exact time of a post here, at least on the desktop version of GetSatisfaction. To do that, hover your mouse over the time designation of the post (such as "2 years ago"). So, for example, Murray's post that begins "Yes, there were some cases added manually years ago" was posted October 9, 2014 at 10:46:58 PM UTC.

I don't know whether or how it is possible to see the exact date and time on the mobile version of GetSatisfaction.

9 years ago

Murray

Employee

•

18 Messages

•

2.9K Points

Checking in to say that we're still working on it, but at this point can't commit to a timeline.

9 years ago

Marco

3.4K Messages

•

94.2K Points

Thanks Gromit!
Is there also a way I could have addressed this reply to your post instead of to my own that I haven't found?

9 years ago

Marco

3.4K Messages

•

94.2K Points

Thanks for letting us know you're still working on it.

9 years ago

sorin_1301777

9 Messages

•

116 Points

Come on! If you haven't done much in 7 years, the timeline is clear: for ever! :)

9 years ago

sorin_1301777

9 Messages

•

116 Points

Correction: "don't you FEEL"

9 years ago

jeorj_euler

10.7K Messages

•

226.4K Points

What about XML character entity references?

— Jeorj Euler, an IMDb regular registrant

9 years ago

owenrees

523 Messages

•

15.3K Points

SGML/HTML/XML character references are no more useful in solving the underlying problem of representing and processing the full range of Unicode than any of a number of other encodings. They make sense if the data is represented and processed in XML - perhaps using technology such as XSLT - but even then they would appear only in externalised forms emitted as output or accepted as input. Since XML is, for preference, represented in UTF-8 in externalised forms, using character references does not give much benefit.

Using SGML character references in internal representations would cause all sorts of problems, especially with searching and matching.

9 years ago

jeorj_euler

10.7K Messages

•

226.4K Points

I see that the IMDb staff has left this proposal in an "under consideration" state. Very interesting.

I shall opine that it is not so challenging for search algorithms to be made to account for strings encoded with standard character entity references, and it would be a shame if most of the libraries and engines behind most search tools deployed in any electronic database anywhere throughout the World-Wide Web lacked such a capability. But likewise, the same could be said of Unicode deployment, or that of Internet Protocol v6 for that matter.

— Jeorj Euler, an IMDb regular registrant

9 years ago

owenrees

523 Messages

•

15.3K Points

According to https://en.wikipedia.org/wiki/SGML_entity#Character_entities - and I have no reason to doubt its accuracy -

HTML 4, for example, has 252 built-in character entities that don't have to be explicitly declared. XML has five. XHTML has the same five as XML, but if its DTDs are explicitly used, then it has 253 (&apos; being the extra entity beyond those in HTML 4).

This calls into doubt the concept of "standard character entity references" and also makes it clear that the sets of character entities that can be considered to be in common use do not cover the range of Unicode codepoints.

If we allow numeric character references - both decimal and hexadecimal - then each Unicode codepoint in the data can be represented in three or four ways in any system that can handle Unicode. The only rational way to deal with that complexity is to decode the data to strings of Unicode codepoints before applying normalisation and then using it in whatever processing is required. Having decoded the data to Unicode codepoints, the simplest and most widely supported encoding to use for any sort of I/O is UTF-8. Unless the data is being embedded in some SGML-like format such as XML, there is no reason to use character references and there is never a reason to use references for characters that do not have specific meanings in the markup if the underlying representation can support Unicode.
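The "decode first, then normalise" approach described above can be sketched in a few lines of Python. The function name `canonical` is hypothetical, and NFC is chosen here only as one reasonable normalisation form; the post does not name one.

```python
import html
import unicodedata


def canonical(text: str) -> str:
    """Decode character references to code points, then apply NFC normalisation."""
    return unicodedata.normalize("NFC", html.unescape(text))


# Five spellings of the same letter: literal, named entity, decimal reference,
# hexadecimal reference, and base letter followed by a combining acute accent.
variants = ["é", "&eacute;", "&#233;", "&#xE9;", "e\u0301"]
print({canonical(v) for v in variants})
print(canonical("caf&eacute;").encode("utf-8"))  # UTF-8 bytes for I/O
```

Once every input has been reduced to one sequence of code points, searching and matching can compare strings directly instead of accounting for each spelling.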

The most fundamental requirement in handling character encodings is to be obsessively strict in tracking how each piece of data is encoded. In general, data may have multiple layers of encodings and it is essential to keep track of which have been applied to each piece of data. Each additional kind of encoding adds complexity, especially if it can be layered on other encodings, so the goal should always be to use as few encodings as possible.

I expect that some filmmaker will want to capture the essence of the World Wide Web and will decide to use a title such as "Markup: < &lt; & &amp; changed the world" and whatever encodings are used by IMDb had better be able to cope with that. (and I hope that this forum can handle it too!).

9 years ago

hessu_hopo

5 Messages

•

358 Points

How is this still a thing.

8 years ago

leodevbro

5 Messages

•

248 Points

When the time comes, please don't forget Georgian language characters to be in the supported characters list.

7 years ago

Murray

Employee

•

18 Messages

•

2.9K Points

There's a Unicode block for Georgian characters, so they will be supported automatically. Whether or not the characters display properly in browsers will depend on whether people have a font installed locally.... but presumably those who are interested in Georgian characters will!
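As a small illustration of the point above, the main Georgian block occupies U+10A0 through U+10FF, and any system that stores arbitrary Unicode code points handles those letters like any others. The range constant and helper name below are written for this sketch only.

```python
import unicodedata

# Main Georgian block in the Unicode Standard: U+10A0 through U+10FF.
GEORGIAN_BLOCK = range(0x10A0, 0x1100)


def in_georgian_block(ch: str) -> bool:
    """True if the character's code point lies in the main Georgian block."""
    return ord(ch) in GEORGIAN_BLOCK


letter = "\u10d0"
print(letter, unicodedata.name(letter), in_georgian_block(letter))
print(len(letter.encode("utf-8")), "bytes in UTF-8")
```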

7 years ago

kaveh_azadi

5 Messages

•

204 Points

When will Unicode be fully supported in text fields in IMDB? If this website is really Internet Movie DataBase, it's supposed to support non-English languages, and how come in 2019 your website doesn't support unicode, it's a shame.

7 years ago

justin_eberlein

3 Messages

•

132 Points

Adding to this from 2020 and Quarantine Land: Discovered this after trying to correct Abed's Polish from S01E08 of Community from "Czesc" to "Cześć". It's 2020. If your site is supposed to be international, it really should support Unicode. Although I'll at least grant that it's *uniform* in not allowing non-ASCII, as opposed to the weird trend elsewhere on the internet of only bothering with diacritics if it's a Western European language.

6 years ago

piznajko

172 Messages

•

5.1K Points

Justin Eberlein, IMDb started allowing Unicode in 2019, but it's only allowed in the 'Alternative titles' field, not in the 'Original title' field.

6 years ago

sorin_1301777

9 Messages

•

116 Points

They are really dumb! :D I don't think there's another web site with this huge flaw. There is no other web site struggling so hard to get international! :D

6 years ago

English_pedantic_grammarian

84 Messages

•

1.6K Points

The main reason IMDb doesn't support Unicode throughout is that it was founded in 1990. Changing that isn't a case of pressing a button and it works, as there is so much data entered in the current system, which is built with the presumption of the non-existence of Unicode, and new data coming in all the time. If you want a database of movies and TV that supports Unicode throughout, you can either complain here (to no effect whatsoever), or use one founded in 2008.

5 years ago

Daffodils

2 Messages

•

60 Points

Any news on unicode support? It's 2022 and it's absolutely shameful that IMDB doesn't support unicode characters. So many foreign names on here are misspelt because of it. Unicode characters aren't just aesthetic quirks - they change the meaning of words.

4 years ago

Maatamun_0303

12 Messages

•

310 Points

1 year ago

Dear IMDb.

I am writing to share some observations and suggestions regarding the handling of names on your platform. I have noticed that there are some challenges with the correct representations of names, especially when it comes to diacritical marks and non-Latin writing systems.

For example, if one searches for a person named "Lasse Kvelnes", they are directed to a profile named Lasse Kvalsnes and his real name is only an alternative name but in reality he doesn't seem to have an alternative name. Maybe it would be a good idea to only use "alternative names" for nicknames or abbreviated names. Otherwise this can be confusing and potentially lead to errors. "Zdena Pelikanova" is a twin profile to "Zdenka Pelikanova" using one of her nicknames and according to Czech orthography the name should be written "Zdeňka Pelikánová". The profile "Zdenka Pelikonova" would represent a pronunciation error if it could speak and present itself. This highlights the need for IMDb to upgrade the system to accept the full range of Unicode characters. Moreover, it is difficult for users who are searching for individuals with names written in non-Latin scripts such as Hangul.

I understand that there may be technical limitations that make it difficult to implement full Unicode support. However, I believe that this could be a valuable improvement to your platform, making it more inclusive and user-friendly for an international audience.

I look forward to hearing your thoughts on this suggestion.

Best regards Maatamun

Note: This comment was created from a merged conversation originally titled Handling of names on your platform

owenrees

523 Messages

•

15.3K Points

There have been various posts on better support for Unicode and I think it might be better to consolidate the support into many votes on one idea rather than spreading them across many.

3 years ago

owenrees

523 Messages

•

15.3K Points

The thread I cited has been merged into this one and it looks as if that does not carry over the votes. This thread has 17 votes as I am posting this so we have lost at least 37 votes unless people who had voted for it had removed their votes.

This is unfortunate if the number of votes is being used as a measure of support for an idea.

1 year ago

Marco

3.4K Messages

•

94.2K Points

@owenrees I seem to recall the issue of votes not being carried over has been raised before but nothing has been done about it, but I can't find the thread about it...

1 year ago

linyou322

24 Messages

•

724 Points

1 year ago

Suggestion: The title can support the two tone letters ā and ē.

Note: This comment was created from a merged conversation originally titled 2024-12-06 Suggestion

MartinK75

86 Messages

•

1.4K Points

1 year ago

I'd like to use the following accented character in a submission: š but it's not on your list of approved accents - can a solution be found?

Note: This comment was created from a merged conversation originally titled Accented character

Maya

Employee

•

7.5K Messages

•

79.1K Points

Hi MartinK75-

Thank you for reporting this issue. I've forwarded this information to the appropriate team for further reviewing (Ref Ticket #V1621449887). We'll reply once we receive further information.

Cheers!

1 year ago

Michelle

Employee

•

18.3K Messages

•

322.5K Points

Hi @MartinK75 -

Unfortunately, at this time IMDb only accepts non-Latin-1 characters in a limited set of places, specifically in title AKAs.

1 year ago

passport78556

2 Messages

•

70 Points

11 months ago

IMDb currently does not support several essential Turkish characters (namely Ş, ş, ğ, and ı) in movie titles, contributor names, and user reviews. This limitation leads to incorrect representation of Turkish content. For example, the film “Kış Uykusu” (Winter Sleep) is forced to appear as “Kis Uykusu,” and names like “Yağmur” become “Yagmur,” which distorts the original language and meaning.

It’s important to note that other Turkish Latin-based characters such as Ç, ç, Ö, ö, Ü, and ü are already accepted and display correctly on IMDb. The exclusion of Ş, ş, ğ, and ı appears to be a technical oversight, as all these characters are part of the standard Unicode Latin Extended-A block and are supported by modern platforms and databases.

Why does this matter? Accurate language representation is essential for cultural respect and user trust. Contributors and audiences deserve to see correct titles and names, as they appear in the original language. IMDb’s global reputation relies on inclusivity and technical excellence.

Recommended Actions:
Update IMDb’s character validation to accept Ş, ş, ğ, and ı in all relevant fields.
Retroactively correct existing titles and names to restore their original spelling.
Ensure these characters are supported across all IMDb platforms (web, mobile, API).

Supporting these characters is not just a technical fix; it’s a step toward respecting linguistic diversity and improving the user experience for millions of Turkish speakers and international audiences. For reference, see how other official platforms handle language support and user accessibility, such as https://passportstatuscheck.pk/, which offers multilingual support and ensures all international characters are displayed correctly. Thank you for considering this important update to make IMDb more inclusive and accurate for all users.
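The split described above, where some Turkish letters are accepted and others are not, lines up with Unicode block boundaries: the accepted letters fall at or below U+00FF (the range covered by ISO 8859-1), while the rejected ones sit in Latin Extended-A (U+0100 through U+017F). A minimal sketch that checks this; the helper name `block_of` is hypothetical.

```python
LATIN_1 = range(0x0000, 0x0100)           # Basic Latin plus Latin-1 Supplement
LATIN_EXTENDED_A = range(0x0100, 0x0180)  # U+0100 through U+017F


def block_of(ch: str) -> str:
    """Classify a character by the code point ranges relevant to this thread."""
    code_point = ord(ch)
    if code_point in LATIN_1:
        return "Latin-1"
    if code_point in LATIN_EXTENDED_A:
        return "Latin Extended-A"
    return "other"


for ch in "ÇçÖöÜüŞşğı":
    print(ch, f"U+{ord(ch):04X}", block_of(ch))
```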
Note: This comment was created from a merged conversation Link : https://community-imdb.sprinklr.com/conversations/data-issues-policy-discussions/request-for-imdb-to-support-turkish-characters-s-s-g/6842062633343b5c156304e3 Title : Request for IMDb to Support Turkish Characters (Ş, ş, ğ, ı)

Mark13

Employee

•

145 Messages

•

1.5K Points

Thanks, passport78556. We appreciate you explaining how the current character limitations affect both fans and professionals. Your suggested solutions are helpful. I'll work with our team to explore ways we can better support Turkish language characters in the near term. Thanks, Mark

11 months ago

velikural

8 Messages

•

306 Points

There may be tens of thousands of international movies/shows for which contributors are unable to use some of the characters from the original language. The ones I know of are some Turkish characters, such as Ş, ş, ğ, and ı, which are derived from the Latin alphabet. I am quite sure that this can be solved, because the other distinctive Turkish characters (again derived from the Latin alphabet), such as Ç, ç, Ö, ö, Ü, and ü, are accepted and appear correctly on IMDb. Also, when submitting a user review (which includes the original title), I received this warning: “Sorry, your submission contains the following invalid characters: Ş, ğ, ı, ş. Please correct them and resubmit.” In short, many of the original titles and names (and the reviews that include them) have these characters missing or displayed incorrectly. Is it possible to introduce these characters to IMDb's system (at least the four characters mentioned above)?
Note: This comment was created from a merged conversation Link : https://community-imdb.sprinklr.com/conversations/imdbcom/problem-of-being-unable-to-use-some-of-the-international-characters/6841fd2c13f03f4b7e1de3d0 Title : Problem of being unable to use some of the international characters

11 months ago

owenrees

523 Messages

•

15.3K Points

See https://community-imdb.sprinklr.com/conversations/data-issues-policy-discussions/unicode-25year-update/625f7dbf61c81e7c7cd37277 for some of the history.

11 months ago

velikural

8 Messages

•

306 Points

Thank you for your interest in this important issue. I checked more thoroughly and noticed that we forgot two capital letters: İ and Ğ. So, the complete set of Turkish characters missing on IMDb is: Ğ, ğ, ı, İ, Ş, and ş (6 characters in total).
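As a rough illustration of why exactly these six letters are affected, here is a minimal Python sketch. It assumes that the fields in question are still limited to ISO 8859-1 (Latin-1), as the 1997 newsletter item quoted at the top of this thread describes; IMDb's actual validation rules are not public, and the helper name `outside_latin1` is made up for this example.

```python
import unicodedata


def outside_latin1(text: str) -> list[str]:
    """Return the distinct characters of `text` that ISO 8859-1 cannot encode.

    Assumption: a Latin-1-only field rejects anything that fails this check.
    This is an illustration, not IMDb's real validation code.
    """
    rejected = []
    # Normalize to NFC so that letters typed as base + combining mark
    # are compared in their precomposed form.
    for ch in unicodedata.normalize("NFC", text):
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            if ch not in rejected:
                rejected.append(ch)
    return rejected


# The Turkish letters that differ from the basic Latin alphabet.
print(outside_latin1("ÇçĞğıİÖöŞşÜü"))  # ['Ğ', 'ğ', 'ı', 'İ', 'Ş', 'ş']
print(outside_latin1("Kış Uykusu"))    # ['ı', 'ş']
```

Under that assumption, Ç, ç, Ö, ö, Ü, and ü pass because they are part of Latin-1, while the six letters listed above sit in the Latin Extended-A block and are rejected.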

11 months ago

devdarianika

2 Messages

•

72 Points

4 months ago

Dear IMDb Support Team,

I am writing to bring your attention to a technical limitation regarding the submission of movie titles in the Georgian language. Currently, when attempting to add a Georgian title in the "Also Known As" (AKA) or "Release Info" section using the native Georgian script (Mkhedruli), the submission system rejects it.

I specifically encountered this issue while trying to add a title, where I received the following error: "The character [მ] (U+10DB - GEORGIAN LETTER MAN) is not supported here." The system continues to flag every subsequent Georgian character as unsupported.

Georgia has a rich cinematic history and an active modern film industry. The inability to list titles in their original script on a global platform like IMDb is a significant drawback for local users, researchers, and film enthusiasts. In an era where Unicode is the global standard, I kindly request that you update the submission forms to support the Georgian alphabet. This will allow contributors to provide more accurate and localized data for your database.

Thank you for your time and for maintaining the world's most important film database.

Best regards,
Nika Devdariani
Note: This comment was created from a merged conversation Link : https://community-imdb.sprinklr.com/conversations/data-issues-policy-discussions/support-for-georgian-unicode-mkhedruli-script-in-title-submissions/695bd9b41d59574bad161705 Title : Support for Georgian Unicode (Mkhedruli script) in Title Submissions
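For anyone who wants to check the code point quoted in that error message, the following small Python sketch looks it up in the Unicode Character Database that ships with the standard library. The function name `describe` is hypothetical, and the output format merely imitates the message quoted above; it is not IMDb's code.

```python
import unicodedata


def describe(ch: str) -> str:
    """Format one character as 'U+XXXX - UNICODE NAME'."""
    return f"U+{ord(ch):04X} - {unicodedata.name(ch)}"


# The Georgian letter from the quoted error message.
print(describe("მ"))  # U+10DB - GEORGIAN LETTER MAN
```

The Mkhedruli letters occupy the Georgian block starting at U+10D0, far outside the 0 to 255 range of ISO 8859-1, so a field limited to Latin-1 (an assumption about the fields involved) would flag every Georgian character in a title.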

geeked_out_4_movies

177 Messages

•

3.3K Points

2 months ago

Hi, can you please allow Hebrew / Arabic / Persian alternate titles?
Note: This comment was created from a merged conversation Link : https://community-imdb.sprinklr.com/conversations/data-issues-policy-discussions/please-allow-hebrew-arabic-persen-alternate-titles/69963cb31c05121895b594d7 Title : Please allow Hebrew / Arabic / Persen alternate titles