view contrib/unicode2nginx/unicode-to-nginx.pl @ 7829:2851e4c7de03

Mail: fixed reading with fully filled buffer (ticket #2159). With SMTP pipelining, ngx_mail_read_command() can be called with s->buffer without any space available, to parse additional commands received to the buffer on previous calls. Previously, this resulted in recv() being called with zero length, resulting in zero being returned, which was interpreted as a connection close by the client, so nginx silently closed connection. Fix is to avoid calling c->recv() if there is no free space in the buffer, but continue parsing of the already received commands.
author Maxim Dounin <mdounin@mdounin.ru>
date Wed, 21 Apr 2021 23:24:59 +0300
parents 8752257e883f
children
line wrap: on
line source

#!/usr/bin/perl -w

# Convert unicode mappings to nginx configuration file format.

# You may find useful mappings in various places, including
# unicode.org official site:
#
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT
# http://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT

# Needs perl 5.6 or later.

# Written by Maxim Dounin, mdounin@mdounin.ru

###############################################################################

require 5.006;

while (<>) {
	# Skip comments and empty lines

	next if /^#/;
	next if /^\s*$/;
	chomp;

	# Convert mappings

	if (/^\s*0x(..)\s*0x(....)\s*(#.*)/) {
		# Mapping <from-code> <unicode-code> "#" <unicode-name>
		my $cs_code = $1;
		my $un_code = $2;
		my $un_name = $3;

		# Produce UTF-8 sequence from character code;

		my $un_utf8 = join('',
			map { sprintf("%02X", $_) }
			unpack("U0C*", pack("U", hex($un_code)))
		);

		print "    $cs_code  $un_utf8 ; $un_name\n";

	} else {
		warn "Unrecognized line: '$_'";
	}
}

###############################################################################