This is the talk page for discussing improvements to the Code page article. This is not a forum for general discussion of the article's subject.
This article is rated C-class on Wikipedia's content assessment scale.
This article was nominated for merging with Character encoding on February 27, 2014. The result of the discussion was to keep the articles separate.
After reading the article, I still do not understand how a code page and an encoding differ, which is what the article claims. -- Voidvector ( talk) 13:49, 31 August 2008 (UTC)
Reply: think of a codepage as a list of characters, and an encoding as a way that the characters are stored.
For instance, the Unicode character set has a trademark symbol at position 8482 (2122 hex).
So the codepage simply says: 8482 -> TM.
Now if this is encoded as UTF-32, this is a 32-bit word with value 8482. If it's encoded as UTF-16LE, it would be two bytes with values 34 and 33.
8-bit codepages don't have different encodings: a byte is a byte. So if a codepage has a TM at position 153 for instance, that means the encoding is the value 153 for that character. So the encoding matches the codepage listing byte for byte.
Pim 2 ( talk) 15:00, 22 May 2009 (UTC)
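The numbers in the reply above can be checked with a short Python sketch. It is only an illustration: the choice of Windows-1252 (Python codec name `cp1252`) as the 8-bit code page with the trademark sign at position 153 is an assumption made for this example, not something the reply names.

```python
# U+2122 TRADE MARK SIGN, the character used in the reply above.
TM = "\u2122"

# The character set assigns the character a number (its code point).
code_point = ord(TM)  # 8482 decimal, 0x2122 hex

# An encoding decides how that number is stored as bytes.
utf32_le = TM.encode("utf-32-le")  # one 32-bit word, low byte first
utf16_le = TM.encode("utf-16-le")  # two bytes, low byte first

# In an 8-bit code page the table position is the stored byte.
# Assumption: Windows-1252 is used as the example code page here.
cp1252 = TM.encode("cp1252")

print(code_point, hex(code_point))         # 8482 0x2122
print(int.from_bytes(utf32_le, "little"))  # 8482
print(list(utf16_le))                      # [34, 33]
print(list(cp1252))                        # [153]
```

The same code point yields different byte sequences under UTF-32 and UTF-16LE, while the 8-bit code page has only one possible stored form, which is the distinction the reply is drawing.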
I totally agree with Voidvector: the difference between a code page and a character encoding is still not clear, even with Pim 2's explanation. A "character encoding" is defined as any number of pairs { character + code }, which contradicts Pim 2's "the encoding matches the codepage listing byte for byte". Thus code page = character encoding; only the name is different. Sandrarossi ( talk) 10:17, 6 August 2009 (UTC)
MIK is almost certainly Code Page 879. ISO 8859-11 is almost certainly Code Page 873.
The Spanish code page 854 is not from IBM, but what was the code page layout? IBM's code page 854 was probably DOS Latin 4, continuing the sequence created by code pages 852, 853, and 855. Alexlatham96 ( talk) 20:23, 12 May 2020 (UTC)
Given the decision at Wikipedia:Articles for deletion/Code page 875 to move nearly all articles on EBCDIC code pages to Wikibooks, are there other articles linked from this page that should be moved as well? -- Beland ( talk) 17:05, 20 July 2020 (UTC)
In the Microsoft part, it says:
1200 – UTF-16LE Unicode (little-endian)
1201 – UTF-16BE Unicode (big-endian)
In the IBM part, it says:
1200 – UTF-16BE Unicode (big-endian) with PUA
1201 – UTF-16BE Unicode (big-endian)
1202 – UTF-16LE Unicode (little-endian) with IBM PUA
1203 – UTF-16LE Unicode (little-endian)
The two lists contradict each other on byte order: 1200 is little-endian according to the Microsoft list but big-endian according to the IBM list.
So, what is this mess? 77.159.196.124 ( talk) 13:41, 29 August 2022 (UTC)
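A small Python sketch of why the conflict matters in practice. The two dictionaries are hypothetical lookup tables built only from the lists quoted above (ignoring the PUA distinction); they are not an official mapping from either vendor, and the codec names are Python's own.

```python
# Hypothetical tables transcribed from the two lists quoted above.
# The "with PUA" variants are treated as plain UTF-16 for this illustration.
MICROSOFT = {1200: "utf-16-le", 1201: "utf-16-be"}
IBM = {1200: "utf-16-be", 1201: "utf-16-be",
       1202: "utf-16-le", 1203: "utf-16-le"}

# Encode the trademark sign (U+2122) as "code page 1200" per the Microsoft list.
data = "\u2122".encode(MICROSOFT[1200])  # bytes 0x22 0x21

# Decode the same bytes as "code page 1200" under each interpretation.
as_microsoft = data.decode(MICROSOFT[1200])
as_ibm = data.decode(IBM[1200])

print(hex(ord(as_microsoft)))  # 0x2122, the original character
print(hex(ord(as_ibm)))        # 0x2221, a different character
```

Data labelled only as "1200" therefore decodes to different characters depending on which vendor's numbering the reader assumes.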
Where can I find the full IBM PUA mapping? For example code page 1056 has many PUA characters. Alexlatham96 ( talk) 03:17, 1 May 2023 (UTC)
Until somebody can come up with a specific reference to this manual, this should be regarded as apocryphal. Note discussion at https://retrocomputing.stackexchange.com/questions/14780/is-the-ibm-standard-character-set-manual-around MarkMLl ( talk) 20:59, 31 December 2023 (UTC)