------------------------------------------------
Fixing text damaged by Microsoft-only characters
------------------------------------------------
- Ian! D. Allen - idallen@idallen.ca - www.idallen.com

Composing offline is good - you always want to save a copy of anything you paste into a web form.

My research says you'll find that Word will still save a "plain text" document with bad punctuation characters in it. (This is part of Microsoft's scheme to make you have to use Windows for reading everything, otherwise it "doesn't look right". Microsoft wants you to think that the problem is that you use a Mac or Linux, not that they saved the file in Microsoft-only format. Microsoft does not play well with others.)

There isn't an easy solution to seeing these little buggers while you're still logged in to a Windows machine - I think all the Microsoft programs show the bad characters in the Microsoft-friendly way. Some say that if you switch to Courier font, the curvy "smart quotes" are easier to spot; but, removing them is still going to be necessary.

If you can stand actually *doing* your writing in something simple like Notepad, Notepad doesn't insert the bad characters on its own like Word does. A document typed into Notepad from scratch (no cut-and-paste from Word!) should be "plain text" for the most part. (I don't actually know for sure, since I don't use Windows, and none of this is documented.)

Some people say that saving Word as plain text, then opening the saved text in Notepad, then cutting and pasting from Notepad, gives plain text. Others disagree and say that doesn't work. You likely won't be able to tell as long as you look at the text on a Windows machine.

I did some Google searches that may be helpful or at least exhausting.
I even found a web form that will convert at least some of the Microsoft badness for you (first link), but I don't know if it does *all* the badness:

http://jon.hedley.net/convert-ms-word-to-plain-text
"Convert MS Word To Plain Text" - a little web text box that turns Microsoft-speak into plain text for cutting-and-pasting!

http://tychousa1.umuc.edu/wtdocs/wthelp/html/smartquotes.html
- how to disable smart quotes and hyphens - "Quotes and hyphens must be manually removed from existing documents."

http://www.thuto.org/ubh/ub/compu/hcsmsw1.htm
- how to switch off bunches of mis-features in Word

http://www.rileyguide.com/eresume.html#prep
- claims that Auto Format can remove smart quotes - but will it also remove smart hyphens and smart ellipsis?

http://internetwritingworkshop.org/formatting.shtml
"Turning off smart quotes, dashes, and ellipses does not remove existing examples in your text. They can be changed manually, or you can use Find and Replace. The question then becomes, what do you put in the Find line? The answer is, go into your text, find an example of an unwanted character, select it, copy it, and paste it into the Find line. Type its replacement in the Replace line, and click on Replace All."

http://www.gutenberg.org/wiki/Gutenberg:Word_Processor_FAQ
"switch off any feature that changes what you type without asking you."

http://www.garypresley.net/NFICTION/Nfconfigureemail.htm
"FORMATTING EMAIL FOR THE INTERNET WRITING WORKSHOP" - I'm not sure I'd follow the advice of anyone writing in upper-case

http://www.strangehorizons.com/guidelines/fiction-formatting-detail.shtml
- an exhaustive description of how you could search-and-replace away all the bad characters and make plain ASCII text (with ASCII-like mark-up for italic and bold too, if you want).

http://wordpress.org/support/topic/87387
"After reading the forums I found many people who have fallen into the same trap of cutting and pasting from MS Word. This really causes problems.
I haven't seen any warning signs about this (maybe I am not looking in the right places). Anyway, if a moderator reads this, it would be nice to have a sticky in big bold letters warning people about this, and, of course, other dangerous no no's. I did read one quite humorous post about it, but I only found that thread after I had already started pulling my hair out. Anyway, thanks everyone. I am now not going to use the rich text editor anymore (not that that is a problem - it is just easier to see problem code using the basic editor), and I definitely will not go anywhere near MS Word. I do want to keep the rest of my hair."

And the final comment:

http://everything2.com/index.pl?node_id=75854
"Smart Quotes: A Microsoft Windows innovation© and deviation from standard character sets. Smartquotes inserts nonstandard quote characters that fail to render on non windows platforms (e.g. Unix, Linux). The result is nonstandard Microsoft specific HTML that renders like this:

    I picked up the dog?s food bowl. It?s strange how delicious the dog?s food looked. But it?s the dog?s food, not mine, so I left it alone.

[...]

A little detective work revealed that, as is usually the case when you encounter something shoddy in the vicinity of a computer, Microsoft incompetence and gratuitous incompatibility were to blame. Western language HTML documents are written in the ISO 8859-1 Latin-1 character set, with a specified set of escapes for special characters. Blithely ignoring this prescription, as usual, Microsoft use their own "extension" to Latin-1, in which a variety of characters which do not appear in Latin-1 are inserted in the range 0x82 through 0x95--this having the merit of being incompatible with both Latin-1 and Unicode, which reserve this region for additional control characters.

Of course, Microsoft would say the fix is for all the holdouts to switch to windows. Resistance is futile. Resistance is futile.
It's their marketing against piddly little things such as international standards.

--
| Ian! D. Allen  -  idallen@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://www.idallen.com/  -  Contact Improv: http://contactimprov.ca/
| College professor (Open Source / Linux) via: http://teaching.idallen.com/
| Support the public commons and public digital rights: http://eff.org/
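
Several of the pages listed above describe a search-and-replace approach. Here is a minimal Python sketch of it, turning the usual Microsoft punctuation into plain ASCII. The replacement table, the function names plain_text and leftovers, and the guess that undecodable bytes are in Windows code page 1252 are all assumptions made for illustration; none of them come from the pages listed. The table covers only the common quotes, dashes, ellipsis and bullet, so it probably does not catch *all* the badness.

```python
# Sketch only: the replacements chosen here are an assumption, adjust to taste.
# Keys are the Unicode characters that the cp1252 bytes (in comments) decode to.
SMART_TO_ASCII = {
    "\u201a": ",",    # low single quote      (cp1252 byte 0x82)
    "\u201e": '"',    # low double quote      (0x84)
    "\u2026": "...",  # ellipsis              (0x85)
    "\u2018": "'",    # left single quote     (0x91)
    "\u2019": "'",    # right single quote    (0x92)
    "\u201c": '"',    # left double quote     (0x93)
    "\u201d": '"',    # right double quote    (0x94)
    "\u2022": "*",    # bullet                (0x95)
    "\u2013": "-",    # en dash               (0x96)
    "\u2014": "--",   # em dash               (0x97)
}

_TABLE = str.maketrans(SMART_TO_ASCII)


def plain_text(data):
    """Replace Microsoft "smart" punctuation with ASCII equivalents.

    Accepts either a str or raw bytes. Bytes are tried as UTF-8 first;
    if that fails they are assumed (a guess!) to be Windows cp1252.
    """
    if isinstance(data, bytes):
        try:
            text = data.decode("utf-8")
        except UnicodeDecodeError:
            text = data.decode("cp1252", errors="replace")
    else:
        text = data
    return text.translate(_TABLE)


def leftovers(text):
    """Return the sorted non-ASCII characters still present in text."""
    return sorted({ch for ch in text if ord(ch) > 127})


if __name__ == "__main__":
    damaged = b"I picked up the dog\x92s food bowl."
    cleaned = plain_text(damaged)
    print(cleaned)
    print(leftovers(cleaned))
```

Given the dog's food example with the raw byte 0x92 in it, plain_text() hands back the sentence with an ordinary apostrophe. leftovers() lists any non-ASCII characters the table missed, so they can be added to the table or fixed by hand.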