view contrib/unicode2nginx/unicode-to-nginx.pl @ 8672:a2613fc1bce5

Fixed sendfile() limit handling on Linux. On Linux starting with 2.6.16, sendfile() silently limits all operations to MAX_RW_COUNT, defined as (INT_MAX & PAGE_MASK). This incorrectly triggered the interrupt check, and resulted in 0-sized writev() on the next loop iteration. Fix is to make sure the limit is always checked, so we will return from the loop if the limit is already reached even if number of bytes sent is not exactly equal to the number of bytes we've tried to send.
author Maxim Dounin <mdounin@mdounin.ru>
date Fri, 29 Oct 2021 20:21:51 +0300
parents 8752257e883f
children
line wrap: on
line source

#!/usr/bin/perl -w

# Convert unicode mappings to nginx configuration file format.

# You may find useful mappings in various places, including
# unicode.org official site:
#
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT

# Needs perl 5.6 or later.

# Written by Maxim Dounin, mdounin@mdounin.ru

###############################################################################

require 5.006;

while (<>) {
	# Skip comments and empty lines

	next if /^#/;
	next if /^\s*$/;
	chomp;

	# Convert mappings

	if (/^\s*0x(..)\s*0x(....)\s*(#.*)/) {
		# Mapping <from-code> <unicode-code> "#" <unicode-name>
		my $cs_code = $1;
		my $un_code = $2;
		my $un_name = $3;

		# Produce UTF-8 sequence from character code;

		my $un_utf8 = join('',
			map { sprintf("%02X", $_) }
			unpack("U0C*", pack("U", hex($un_code)))
		);

		print "    $cs_code  $un_utf8 ; $un_name\n";

	} else {
		warn "Unrecognized line: '$_'";
	}
}

###############################################################################