From: George Rimar
Date: Thu, 5 Oct 2017 08:15:55 +0000 (+0000)
Subject: [MC] - llvm-mc hangs on non-english characters.
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=fbdce7000bfca57ab473b6bd5c13d0f7db7a68f7;p=llvm

[MC] - llvm-mc hangs on non-english characters.

Currently llvm-mc hangs in an infinite loop while trying to parse a file
that contains ".section .с", where the section name is a non-English
character. This patch fixes the issue.

In this patch I also moved the content of non-english-characters.s to the
test/MC/AsmParser/Inputs folder, so that non-english-characters.s becomes a
single testcase for all invalid inputs containing non-English symbols.
That is convenient because llvm-mc otherwise tries to parse and tokenize
the whole testcase file, including the tool invocations, which makes it
harder to isolate the issue.

Differential revision: https://reviews.llvm.org/D38545

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314973 91177308-0d34-0410-b5e6-96231b3b80d8
---

diff --git a/lib/MC/MCParser/ELFAsmParser.cpp b/lib/MC/MCParser/ELFAsmParser.cpp
index a407691b0bd..8b8e96a4d51 100644
--- a/lib/MC/MCParser/ELFAsmParser.cpp
+++ b/lib/MC/MCParser/ELFAsmParser.cpp
@@ -247,7 +247,7 @@ bool ELFAsmParser::ParseSectionName(StringRef &SectionName) {
     return false;
   }
 
-  while (true) {
+  while (!getParser().hasPendingError()) {
     SMLoc PrevLoc = getLexer().getLoc();
     if (getLexer().is(AsmToken::Comma) ||
       getLexer().is(AsmToken::EndOfStatement))
diff --git a/test/MC/AsmParser/Inputs/non-english-characters-comments.s b/test/MC/AsmParser/Inputs/non-english-characters-comments.s
new file mode 100644
index 00000000000..41711e72424
--- /dev/null
+++ b/test/MC/AsmParser/Inputs/non-english-characters-comments.s
@@ -0,0 +1,10 @@
+# 0bÑ
+# 0xÑ
+# .Ñ4
+# .XÑ
+# .1Ñ
+# .1eÑ
+# 0x.Ñ
+# 0x0pÑ
+.intel_syntax
+# 1Ñ
diff --git a/test/MC/AsmParser/Inputs/non-english-characters-section-name.s b/test/MC/AsmParser/Inputs/non-english-characters-section-name.s
new file mode 100644
index 00000000000..7e255d20601
--- /dev/null
+++ b/test/MC/AsmParser/Inputs/non-english-characters-section-name.s
@@ -0,0 +1 @@
+.section .ñ
diff --git a/test/MC/AsmParser/non-english-characters.s b/test/MC/AsmParser/non-english-characters.s
index 12d78ee83be..0e47a943bd3 100644
--- a/test/MC/AsmParser/non-english-characters.s
+++ b/test/MC/AsmParser/non-english-characters.s
@@ -1,14 +1,9 @@
-# RUN: llvm-mc -triple i386-linux-gnu -filetype=obj -o %t %s
+# RUN: llvm-mc -triple i386-linux-gnu -filetype=obj -o %t \
+# RUN:   %S/Inputs/non-english-characters-comments.s
 # RUN: llvm-readobj %t | FileCheck %s
 # CHECK: Format: ELF32-i386
 
-# 0bÑ
-# 0xÑ
-# .Ñ4
-# .XÑ
-# .1Ñ
-# .1eÑ
-# 0x.Ñ
-# 0x0pÑ
-.intel_syntax
-# 1Ñ
+# RUN: not llvm-mc -triple i386-linux-gnu -filetype=obj -o %t \
+# RUN:   %S/Inputs/non-english-characters-section-name.s 2>&1 | \
+# RUN:  FileCheck %s --check-prefix=ERR
+# ERR: invalid character in input
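
The effect of the changed loop condition can be illustrated with a small standalone model. This is only a sketch and not LLVM code: `ToyLexer`, `parseSectionName`, `ParseResult`, and the `cap` parameter are hypothetical names invented for the illustration. It assumes that the lexer records an error and stops advancing once it reaches a byte it cannot tokenize, which is one way a `while (true)` loop over tokens could spin forever.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical toy lexer, not LLVM's AsmLexer. Assumption: on a non-ASCII
// byte it records a pending error and does not advance any further.
struct ToyLexer {
  std::string input;
  std::size_t pos = 0;
  bool pendingError = false;

  // A section name ends at a comma or at the end of the input.
  bool atEnd() const { return pos >= input.size() || input[pos] == ','; }

  // Consume one byte, or flag an error and stay in place.
  void lex() {
    if (pos >= input.size())
      return;
    if (static_cast<unsigned char>(input[pos]) >= 0x80) {
      pendingError = true;
      return;
    }
    ++pos;
  }
};

struct ParseResult {
  bool error = false;     // true if the lexer reported an error
  std::size_t steps = 0;  // number of loop iterations that called lex()
  bool hitCap = false;    // true if only the safety cap stopped the loop
};

// guardOnError = false models `while (true)`;
// guardOnError = true models `while (!hasPendingError())`.
// `cap` exists only so that the unguarded variant terminates in this demo.
ParseResult parseSectionName(const std::string &text, bool guardOnError,
                             std::size_t cap = 1000) {
  assert(cap > 0);
  ToyLexer lexer{text};
  ParseResult result;
  while (!guardOnError || !lexer.pendingError) {
    if (lexer.atEnd())
      break;
    if (result.steps == cap) {
      result.hitCap = true;
      break;
    }
    lexer.lex();
    ++result.steps;
  }
  result.error = lexer.pendingError;
  return result;
}
```

Under these assumptions, `parseSectionName(".\xC3\xB1", true)` stops after two iterations with `error` set, while the same input with `guardOnError` set to `false` only stops because of the artificial cap. Valid input such as `".text"` behaves the same in both variants.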