MS Word cleaning and format removal
From TYPO3Wiki
Word cleaning should:
* Replace returns and line feeds with spaces
* Remove all attributes on <b>, <strong>, <i>, <em>, <p>, <li>, <ul> tags
* Remove style, class and align attributes on all tags
* Replace <em> tags with <i> tags
* Remove <span> tags
* Remove <div> tags
* Remove <?xml:> tags
* Remove <st1:> tags
* Remove <[a-z]:> tags
* Remove <!-- > tags
* Remove double tags
* Remove double spaces
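The Word cleaning rules above could be sketched as a chain of regular-expression substitutions. This is a minimal illustration in Python, not the RTE's actual code: the function name `clean_word_html` is hypothetical, the order of the steps is an assumption, and the "double tags" rule is left out because the text does not define what counts as a double tag.

```python
import re

# Tags whose attributes are all dropped (list taken from the rules above).
BARE_TAGS = r"b|strong|i|em|p|li|ul"

# style/class/align attributes with double-quoted, single-quoted or bare values.
PRESENTATIONAL_ATTR = re.compile(
    r"""\s+(?:style|class|align)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)


def clean_word_html(html: str) -> str:
    """Hypothetical helper: apply the Word cleaning rules with regexes."""
    # Replace returns and line feeds with spaces.
    html = re.sub(r"[\r\n]+", " ", html)
    # Remove <!-- ... --> comments.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # Remove <?xml:...> tags.
    html = re.sub(r"<\?xml[^>]*>", "", html, flags=re.IGNORECASE)
    # Remove namespaced tags such as <st1:...> or <o:p>, opening and closing.
    html = re.sub(r"</?[a-z][a-z0-9]*:[^>]*>", "", html, flags=re.IGNORECASE)
    # Remove <span> and <div> tags, keeping their content.
    html = re.sub(r"</?(?:span|div)\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Remove all attributes on the listed tags.
    html = re.sub(rf"<({BARE_TAGS})\b[^>]*>", r"<\1>", html, flags=re.IGNORECASE)
    # Remove style, class and align attributes inside any remaining tag.
    html = re.sub(
        r"<[^>]+>", lambda m: PRESENTATIONAL_ATTR.sub("", m.group(0)), html
    )
    # Replace <em> tags with <i> tags.
    html = re.sub(r"<(/?)em\b[^>]*>", r"<\1i>", html, flags=re.IGNORECASE)
    # Remove double spaces.
    return re.sub(r" {2,}", " ", html)
```

Regular expressions are used here only because the rules are phrased as simple tag-level replacements; a DOM-based cleaner would be more robust against malformed markup.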
HTML format removal should:
* Remove abbr, acronym, b, big, cite, code, em, font, i, q, s, samp, small, span, strike, strong, sub, sup, u and var tags
* Remove style, class and align attributes on all tags
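As an illustration of the two format removal rules above, the following Python sketch strips the listed tags while keeping their text, then drops the presentational attributes from whatever tags remain. The function name `remove_html_format` is hypothetical, and the regex-based approach is an assumption rather than the RTE's own implementation.

```python
import re

# Formatting tags to remove (list taken from the rule above).
FORMAT_TAGS = (
    "abbr", "acronym", "b", "big", "cite", "code", "em", "font", "i", "q",
    "s", "samp", "small", "span", "strike", "strong", "sub", "sup", "u", "var",
)

# Opening or closing formatting tag; \b keeps <b> from matching <br>, etc.
FORMAT_TAG_RE = re.compile(
    r"</?(?:" + "|".join(FORMAT_TAGS) + r")\b[^>]*>", re.IGNORECASE
)

# style/class/align attributes with double-quoted, single-quoted or bare values.
PRESENTATIONAL_ATTR = re.compile(
    r"""\s+(?:style|class|align)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)


def remove_html_format(html: str) -> str:
    """Hypothetical helper: drop formatting tags and presentational attributes."""
    # Remove the formatting tags themselves, keeping the text they enclose.
    html = FORMAT_TAG_RE.sub("", html)
    # Remove style, class and align attributes inside any remaining tag.
    return re.sub(
        r"<[^>]+>", lambda m: PRESENTATIONAL_ATTR.sub("", m.group(0)), html
    )
```

Structural tags such as <p>, <br> and <ul> survive this pass; only their style, class and align attributes are removed.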
All HTML tags removal should:
* Remove all tags
* Wrap the selection in <p> tags
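The two steps above are small enough to sketch in a few lines of Python. The function name `strip_all_tags` is hypothetical, and the sketch assumes the selection arrives as an HTML string and is wrapped in a single <p> element; character entities are left untouched.

```python
import re


def strip_all_tags(selection: str) -> str:
    """Hypothetical helper: remove every tag, then wrap the text in <p>."""
    # Remove anything that looks like a tag, keeping only the text between tags.
    text = re.sub(r"<[^>]+>", "", selection)
    # Wrap the remaining text in a single paragraph.
    return f"<p>{text}</p>"
```

For example, `strip_all_tags('<div><b>Hello</b> <a href="#">world</a></div>')` yields `<p>Hello world</p>`.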
An "RTE MSWord Cleanup" extension (rte_mswordclean) can be found here: [1]