UnicodeConverter For Linux

GPL    

UnicodeConverter is a Java program that converts text and HTML files in ISC, TCVN3 (ABC), VISCII, VNI, and VPS format to Unicode.

Description

UnicodeConverter is a Java program that converts text and HTML files in ISC, TCVN3 (ABC), VISCII, VNI, and VPS formats to Unicode UTF-8. Conversion support for Unicode Composite, Numeric Character References (NCR), and VIQR (Vietnet) is also included. In all cases, the output will be in Unicode Normalization Form C, better known as Unicode Precomposed format.
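
As a rough illustration of the Precomposed output described above, the sketch below uses the standard java.text.Normalizer class to compose a Unicode Composite string into Normalization Form C and to decode Numeric Character References. It is not UnicodeConverter's own code: the class and method names are hypothetical, and Normalizer requires Java 6 or later, which is newer than the JRE 1.4 the program targets.

```java
import java.text.Normalizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of UnicodeConverter.
class PrecomposedSketch {

    // Matches decimal (&#7871;) and hexadecimal (&#x1EBF;) character references.
    private static final Pattern NCR =
            Pattern.compile("&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));");

    // Compose base letters and combining marks into precomposed characters (NFC).
    static String toPrecomposed(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    // Replace each numeric character reference with the character it names,
    // then normalize the result. Out-of-range references are not handled here.
    static String decodeNcr(String text) {
        Matcher m = NCR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2));
            String ch = new String(Character.toChars(codePoint));
            m.appendReplacement(out, Matcher.quoteReplacement(ch));
        }
        m.appendTail(out);
        return toPrecomposed(out.toString());
    }

    public static void main(String[] args) {
        // "tieng" with the e followed by combining circumflex and combining acute.
        String composite = "tie\u0302\u0301ng";
        String precomposed = toPrecomposed(composite);
        System.out.println(composite.length() + " -> " + precomposed.length());
        System.out.println(decodeNcr("ti&#7871;ng").equals(precomposed));
    }
}
```

The composite string holds a base letter plus two combining marks, while the precomposed form stores the same letter as one code point, which is why the string gets shorter.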

UnicodeConverter, executable in both graphical user interface (GUI) and command-line modes, is capable of converting multiple files in a directory, or an entire directory, including its subdirectories. In effect, this capability enables conversion of an entire website to Unicode UTF-8 format with a single command or a few mouse clicks. Drag-and-drop support is also included.

Support for conversion of Word documents and Excel workbooks on the Windows platform is included. This feature is implemented using JACOB, a Java-COM Bridge that allows clients to call COM Automation components from Java. JACOB uses Java Native Interface (JNI) to make native calls into the COM and Win32 libraries; consequently, the added functionality is neither portable nor available on other platforms. Conversion support for Rich Text Format files is also provided.

UnicodeConverter is released and distributed under the GNU General Public License. Its homepage is at http://unicodeconvert.sourceforge.net.

SYSTEM REQUIREMENTS

You will need to have the Java 2 Runtime Environment, Standard Edition (JRE) 1.4 or later installed on your machine to execute UnicodeConverter. The JRE can be downloaded free from http://java.sun.com/j2se/. It consists of the Java virtual machine, the Java platform core classes, and supporting files that allow you to run applications written in the Java programming language.

On Mac OS X Tiger or Panther, UnicodeConverter runs without additional requirements. For Jaguar 10.2.6 or later, Java 1.4.1 Update 1 can be installed.

To be able to convert Word or Excel documents, you'll need to be on a Windows system with Microsoft Word or Excel installed. Put the file jacob.dll in your path, for example, into the system32 or jre/bin folder.

HOW TO RUN UnicodeConverter

UnicodeConverter is written in Java and packaged as an executable Java archive (JAR). Download and unzip UnicodeConverter-1.3.zip. UnicodeConverter.jar is the executable program to be run. To launch the program in GUI mode, either double-click the UnicodeConverter.jar file or execute the command uni at the command line. Alternatively, the longer commands

java -jar UnicodeConverter.jar

or (on Windows)

javaw -jar UnicodeConverter.jar

will work, too. The filename is case-sensitive on some operating systems. Be sure the directory that contains the UnicodeConverter.jar file is the current directory.

Note: Make sure Microsoft Word/Excel has no files open while you convert Word/Excel documents. Open files may cause errors or slow down the conversion process.

Tip: Minimize the number of text boxes within Word documents to a few; having too many will slow down conversion significantly.

You can select single or multiple files, or a directory d, for conversion. The resulting Unicode output files will be placed in a d_Unicode directory located at the same tree level as the source directory that contains the original files, which remain unchanged. You can also drag files or a directory from the native file manager and drop them onto the application window to start the conversion.
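
The naming rule for the output directory can be sketched as follows. This only mirrors the d to d_Unicode rule as described in the paragraph above; the class and method names are hypothetical and the code is not taken from UnicodeConverter. It uses java.nio.file, which requires Java 7 or later.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that mirrors the documented naming rule.
class OutputDirSketch {

    // For a source directory d, return the sibling directory d_Unicode.
    static Path unicodeOutputDir(Path sourceDir) {
        Path name = sourceDir.getFileName();
        if (name == null) {
            // A root path such as "/" has no directory name to extend.
            throw new IllegalArgumentException("no directory name in: " + sourceDir);
        }
        return sourceDir.resolveSibling(name.toString() + "_Unicode");
    }

    public static void main(String[] args) {
        // "website" and "projects" are made-up example names.
        System.out.println(unicodeOutputDir(Paths.get("website")));
        System.out.println(unicodeOutputDir(Paths.get("projects", "website")));
    }
}
```

Because the output directory is a sibling of the source rather than a child, a recursive conversion never reprocesses its own output.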

The program can also function as a command-line program, which is frequently used in batch file processing:

java -jar UnicodeConverter.jar <SourceEncoding> <SourceFile/Dir> <TargetFile/Dir>

where possible options for source encoding are VNI, VISCII, VPS, VIQR, TCVN3, and UNI-COMP. This functionality works for text-based files only, not Word/Excel documents.
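
For batch processing from another Java program, the command line above could be assembled along these lines. This is a sketch under assumptions: the class and method names are hypothetical, the example paths are made up, and whether UnicodeConverter accepts the encoding names case-insensitively is not stated here, so the check below is deliberately exact.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical wrapper around the documented command line.
class CommandLineSketch {

    // Source encodings listed for command-line mode.
    static final Set<String> SOURCE_ENCODINGS = new LinkedHashSet<String>(
            Arrays.asList("VNI", "VISCII", "VPS", "VIQR", "TCVN3", "UNI-COMP"));

    // Build: java -jar UnicodeConverter.jar <SourceEncoding> <Source> <Target>
    static List<String> buildCommand(String encoding, String source, String target) {
        if (!SOURCE_ENCODINGS.contains(encoding)) {
            throw new IllegalArgumentException(
                    "unsupported source encoding: " + encoding
                    + " (expected one of " + SOURCE_ENCODINGS + ")");
        }
        return Arrays.asList("java", "-jar", "UnicodeConverter.jar",
                encoding, source, target);
    }

    public static void main(String[] args) {
        List<String> command = buildCommand("VNI", "website", "website_Unicode");
        System.out.println(String.join(" ", command));
        // To run it, pass the list to new ProcessBuilder(command) and start the
        // process from the directory that contains UnicodeConverter.jar.
    }
}
```

Validating the encoding name before launching avoids starting a JVM per file only to have the run rejected.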

Unicode composite (UNI-COMP) source text files should be saved in UTF-8 format for correct conversion to Unicode precomposed.

The default fonts for the output UTF-8 HTML files are Times New Roman and Arial. Users can change to other Unicode-compliant fonts using Unicode-compatible HTML editors such as FrontPage or Composer. Do not use Unicode-incompatible editors (such as Notepad on Windows 9x/Me) to edit UTF-8 files. Doing so would corrupt the UTF-8 byte sequences, rendering the characters or the file unreadable.
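
One way to check whether a file still contains well-formed UTF-8 after editing is to decode it strictly. The sketch below is an independent illustration, not a feature of UnicodeConverter; the class and method names are hypothetical, and it relies on StandardCharsets, which requires Java 7 or later.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical checker for illustration; not part of UnicodeConverter.
class Utf8CheckSketch {

    // Returns true only if the bytes decode as UTF-8 without any error.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] good = "ti\u1EBFng Vi\u1EC7t".getBytes(StandardCharsets.UTF_8);
        // Cutting a three-byte character after its second byte simulates
        // the kind of damage an incompatible editor can cause.
        byte[] damaged = Arrays.copyOf(good, 4);
        System.out.println(isValidUtf8(good));
        System.out.println(isValidUtf8(damaged));
    }
}
```

To check a real file, the bytes could be read with java.nio.file.Files.readAllBytes and passed to the same method.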

Use Firefox, Netscape, Internet Explorer (Windows), Opera, Mozilla, Safari, OmniWeb, or Chimera web browsers to view UTF-8 HTML files. You will not need to change their default settings; the charset meta tag in the page tells the browsers to use Unicode UTF-8 character encoding in displaying the page.

FILE PREPARATIONS FOR CONVERSION

To ensure successful conversion of HTML files in legacy formats and to minimize post-conversion editing, some pre-conversion conditioning may need to be performed on the source files. Changing the original document fonts to the more common ones with respect to its original encoding may be needed (see table below). Removing obsolete dynamic font links (.pfr or .eot) and associated ActiveX control scripts (e.g., tdserver.js) is also recommended, for leaving them in will needlessly slow down page download.

These basic editing tasks should be done prior to the actual conversion process and can be performed quickly with MDI (multiple document interface) text editors, which allow opening multiple files and performing global find/replace actions on all open files at once. CuteHTML, TextPad, UltraEdit, EditPlus, and EditPad are some text editors that offer such features. They can be found and downloaded at http://www.download.com.

Source Encoding    Fonts for original HTML documents
VNI                VNI-Times, VNI Times, VNI-Aptima, VNI Aptima, VNI-Helve, VNI Helve
VPS                VPS Times, VPS Helv
VISCII             VI Times, VI Arial, HoangYen, MinhQu, PhuongThao, ThaHuong, UHo
TCVN3              .VnTime, .VnTimeH, .VnArial, .VnArialH
VIQR               No font formatting

Note: Due to the nature of TCVN3 encoding, some Vietnamese capital vowels will be converted incorrectly to lowercase. Some post-conversion editing may be necessary.

UNICODE-COMPLIANT FONTS

Windows 95/98/Me have only limited Unicode support, but they are still capable of displaying all Vietnamese characters using appropriate Unicode fonts. Full Unicode support is built into Windows NT/2000/XP. Linux and Mac OS 8.5 or greater have begun to support Unicode. Mac OS X and Palm OS provide full Unicode support.

The following TrueType fonts, which come supplied with Windows 98SE/Me/2000/XP, contain many Unicode characters, including Vietnamese:

Times New Roman, Courier New, Arial, Tahoma, Verdana, Palatino Linotype

This list of Unicode fonts is by no means comprehensive, as more and more fonts are being commercially developed or expanded to include Unicode characters.

Requirements:

• Java 1.4.2 or later

What's New in This Release:

• Refactored using Design Patterns to improve code reusability, program extensibility and maintainability
• Updated JACOB library to version 1.9.1

UnicodeConverter 1.3
