view contrib/unicode2nginx/unicode-to-nginx.pl @ 3293:8abb88374c6c

Fix a bug introduced in r2032: After a child process has read a terminate message from a channel, the process tries to read the channel again. The kernel (at least FreeBSD) may preempt the process and sends a SIGIO signal to a master process. The master process sends a new terminate message, the kernel switches again to the the child process, and the child process reads the messages instead of an EAGAIN error. And this may repeat over and over. Being that the child process can not exit the cycle and test the termination flag set by the message handler. The fix disallow the master process to send a new terminate message on SIGIO signal reception. It may send the message only on SIGALARM signal.
author Igor Sysoev <igor@sysoev.ru>
date Wed, 04 Nov 2009 19:41:08 +0000
parents 63a820b0bc6c
children 8752257e883f
line wrap: on
line source

#!/usr/bin/perl -w

# Convert unicode mappings to nginx configuration file format.

# You may find useful mappings in various places, including
# unicode.org official site:
#
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT

# Needs perl 5.6 or later.

# Written by Maxim Dounin, mdounin@rambler-co.ru

###############################################################################

require 5.006;

while (<>) {
	# Skip comments and empty lines

	next if /^#/;
	next if /^\s*$/;
	chomp;

	# Convert mappings

	if (/^\s*0x(..)\s*0x(....)\s*(#.*)/) {
		# Mapping <from-code> <unicode-code> "#" <unicode-name>
		my $cs_code = $1;
		my $un_code = $2;
		my $un_name = $3;

		# Produce UTF-8 sequence from character code;

		my $un_utf8 = join('', map { sprintf("%02X", $_) } unpack("C*", pack("U", hex($un_code))));

		print "    $cs_code  $un_utf8 ; $un_name\n";

	} else {
		warn "Unrecognized line: '$_'";
	}
}

###############################################################################