Ticket #299 (closed defect: fixed)

Opened 3 years ago

Last modified 2 years ago

Unicode characters destroyed by "ASCII trimming"

Reported by: anonymous
Owned by: mbonetti
Priority: normal
Milestone: Gregarius 0.5.5
Component: BUGS
Version: 2.0
Severity: minor
Keywords: unicode trimming character semantics
Cc:

Description

Character semantics should be respected when strings (e.g. feed titles, post titles) are trimmed for presentation. In other words, trimming should never chop a Unicode character in two.

This problem is most obvious with CJK feeds but affects many other languages.

Example:

On the index page, links presenting previous and next posts display trimmed item titles. The trailing Unicode character is often destroyed by byte-semantic trimming. The corrupted character is displayed as a dark diamond enclosing a question mark.
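For illustration, the difference between byte-semantic and character-safe trimming can be sketched as follows. This is a minimal Python sketch, not Gregarius code; the function names trim_bytes and trim_utf8 and the sample title are hypothetical, and it assumes the titles are UTF-8 encoded. The diamond with a question mark is how browsers typically render U+FFFD, the Unicode replacement character, which appears when a multi-byte sequence is cut short.

```python
def trim_bytes(text: str, max_bytes: int) -> bytes:
    """Byte-semantic trimming: may cut a multi-byte character in two."""
    return text.encode("utf-8")[:max_bytes]


def trim_utf8(text: str, max_bytes: int) -> str:
    """Trim to at most max_bytes of UTF-8 without splitting a character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text
    # The input was valid UTF-8, so only the tail of the cut can be an
    # incomplete sequence; errors="ignore" drops those dangling bytes.
    return data[:max_bytes].decode("utf-8", errors="ignore")


title = "台灣新聞頭條"  # hypothetical title; each character is 3 bytes in UTF-8

# Cutting at 7 bytes leaves one dangling byte of the third character,
# which a UTF-8 decoder shows as the replacement character U+FFFD.
broken = trim_bytes(title, 7).decode("utf-8", errors="replace")

# The character-safe variant backs off to the last complete character.
safe = trim_utf8(title, 7)
```

Here broken ends in U+FFFD, while safe contains only the first two complete characters. Trimming by character count instead of byte count would avoid the problem as well.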

Attachments

cjk_chopped_off_char.jpg (47.6 kB) - added by anonymous 3 years ago.
Screenshot of Unicode corruption due to trimming

Change History

Changed 3 years ago by anonymous

Could you please provide a feed URL that exposes this problem?

When you do so, please add "dev-feedback" to the keywords list.

Changed 3 years ago by anonymous

Screenshot of Unicode corruption due to trimming

Changed 3 years ago by anonymous

  • keywords dev-feedback added

Here's a Yahoo Taiwan news feed (UTF-8 Chinese) that should have a number of headlines of sufficient length to be trimmed. It may be necessary to browse through the per-feed item listing to see the effect.

http://tw.search.news.yahoo.com/rss?p=%AB%C8%BBy

Changed 3 years ago by anonymous

Changeset [1092] could fix this. Please let me know.

Changed 3 years ago by anonymous

  • keywords dev-feedback removed

Changed 3 years ago by sdcosta

  • status changed from new to closed
  • resolution set to fixed

Haven't heard back from the poster. Please re-open if this is still a problem.

Changed 2 years ago by kkkkoaaa

  • version set to 2.0
  • summary changed from Unicode characters destroyed by "ASCII trimming" to Unicode characters destroyed by "ASCII trimming"