WO2024148612A1 - Translator generation method, apparatus, device and storage medium - Google Patents

Translator generation method, apparatus, device and storage medium

Info

Publication number
WO2024148612A1
Authority
WO
WIPO (PCT)
Prior art keywords
rules
translator
dsl
rule
translated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/072163
Other languages
English (en)
French (fr)
Inventor
宋鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to EP23915381.0A (EP4650940A4)
Priority to CN202380085841.4A (CN120476380A)
Priority to PCT/CN2023/072163 (WO2024148612A1)
Publication of WO2024148612A1
Anticipated expiration
Current legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/37 Compiler construction; Parser generation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/55 Rule-based translation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/51 Source to source

Definitions

  • the present application relates to the field of translators, and in particular to a translator generation method, apparatus, device and storage medium.
  • a translator is a tool that converts one computer language into another. Commonly used translators include compilers, assemblers, interpreters, etc.
  • the commonly used translator generation method is relatively complicated, and the generated translator can no longer meet the user's usage needs, so there is an urgent need to develop a new translator.
  • the present application provides a translator generation method, apparatus, device and storage medium.
  • the translator generation method is relatively simple.
  • the translator generated by the method described in the present application has strong applicability and can meet different needs of users.
  • the present application provides a translator generation method, comprising: obtaining a description file, the description file defining keywords involved in a first domain-specific language (DSL) and N rules for generating a translator, N being greater than or equal to 1, the description file being obtained based on a configuration, a first rule among the N rules including multiple target strings and at least one wildcard, the first rule being any one of the N rules; based on the description file, generating a translator and a configuration file corresponding to the first DSL, the configuration file including the N rules.
  • the description file is obtained according to the user configuration and defines the keywords involved in the first DSL and N rules for generating the translator, wherein the first rule of the N rules includes multiple target strings and at least one wildcard; the translator and the configuration file corresponding to the first DSL are then generated based on this description file.
  • the method of expressing the description file in this application is simpler; the method is easy to operate, and the generated translator can meet the user's usage requirements.
  • generating a translator and a configuration file corresponding to the first DSL based on the description file includes: generating a fractal tree according to the N rules, each of the N rules is a branch of the fractal tree, and at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located, and the virtual root node is different from the root node of the fractal tree; and generating the translator and the configuration file according to the fractal tree.
  • the present application proposes a new fractal tree structure, in which a virtual root node is set in the fractal tree, so that the N rules in the description file can be constructed into a tree, and when generating a translator, only one tree needs to be traversed.
  • by comparison, when multiple trees are constructed from the N rules, multiple trees need to be traversed when generating a translator.
  • the time complexity and space complexity occupied by the method described in the present application are relatively small.
  • the keyword includes one or more of the name of the description file, the import rule of the description file, the type of the translator, and the language used by the translator.
  • the N rules include recognition rules of the second DSL
  • the method further includes: obtaining a text to be translated input by a user, where the text to be translated is written by the first DSL or the second DSL; and translating the text to be translated into a corresponding syntax tree by the translator.
  • the text to be translated is input into the translator, and the corresponding syntax tree can be obtained through the translator.
  • the generated translator can be used to translate the first DSL, and can also be used to translate the second DSL.
  • translating the text to be translated into a corresponding syntax tree by the translator includes: inputting the configuration file into the translator; and translating the text to be translated into a corresponding syntax tree by the translator.
  • the configuration file can also be input into the translator, and the corresponding syntax tree can be obtained through the translator.
  • the method before inputting the configuration file into the translator, the method also includes: updating N rules in the configuration file to M rules, at least one of the M rules does not belong to the N rules, or at least one of the N rules does not belong to the M rules.
  • the configuration file is used for the user to modify the N rules, that is, when using the translator, the user can modify the rules in the configuration file according to actual needs, and input the modified configuration file into the translator, and the translator translates the input text to be translated.
  • the translator provided by the present application can meet the different needs of different users, is easy to operate, and has strong applicability.
  • the second DSL includes any one of a DBC description language, a CMake script language, and a GCOV description language.
  • the present application provides a translator generation device, characterized in that it includes:
  • an acquisition module configured to acquire a description file, wherein the description file defines keywords involved in the first DSL and N rules for generating a translator, where N is greater than or equal to 1, the description file is obtained based on a configuration, a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules;
  • a generating module is used to generate a translator and a configuration file corresponding to the first DSL based on the description file, wherein the configuration file includes the N rules.
  • the generation module is used to: generate a fractal tree according to the N rules, each of the N rules is a branch of the fractal tree, and at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located, and the virtual root node is different from the root node of the fractal tree; generate the translator and the configuration file according to the fractal tree.
  • the keyword includes one or more of the name of the description file, the import rule of the description file, the type of the translator, and the language used by the translator.
  • the N rules include an identification rule of the second DSL
  • the device further includes a translation module
  • the acquisition module is further used to acquire a text to be translated input by a user, the text to be translated is written by the first DSL or the second DSL
  • the translation module is used to translate the text to be translated into a corresponding syntax tree through the translator.
  • the translation module is used to: input the configuration file into the translator; and translate the text to be translated into a corresponding syntax tree through the translator.
  • the device also includes: an updating module, used to update the N rules in the configuration file to M rules, at least one of the M rules does not belong to the N rules, or at least one of the N rules does not belong to the M rules.
  • the second DSL includes any one of a DBC description language, a CMake script language, and a GCOV description language.
  • Each functional module of the second aspect is used to implement the method described in the first aspect or any possible implementation method of the first aspect.
  • the present application provides a translator generation device, including a processor and a memory, wherein the memory is used to store instructions, and the processor is used to execute the instructions stored in the memory to perform the method described in the first aspect or any possible implementation manner of the first aspect.
  • the device is an in-vehicle computing platform or a server, wherein the server is a local server or a cloud server.
  • the present application provides a storage medium comprising program instructions, which, when executed by a processor, causes the processor to execute the method described in the first aspect or any possible implementation of the first aspect.
  • the present application provides a computer program product, comprising program instructions, which, when executed by a processor, causes the processor to execute the method described in the first aspect or any possible implementation of the first aspect.
  • FIG1 is a flow chart of a translator generation method provided by the present application.
  • FIG2 is an example of a description file grammar rule provided by the present application.
  • FIG3 is a partial flow chart of a translator generation method provided by the present application.
  • FIG4 is a schematic diagram of the structure of a fractal tree provided by the present application.
  • FIG5 is an example diagram provided by the present application.
  • FIG6 is a schematic diagram of the structure of a translator generating device provided by the present application.
  • FIG. 7 is a schematic diagram of the structure of a translator generation device provided in the present application.
  • the present application provides a translator generation method.
  • the translator generation method provided by the present application can be applied in the automotive field, for example, in an on-board computing platform.
  • the on-board computing platform generates a translator according to the method described in the present application, and uses the translator to parse the domain language involved in the controller area network (CAN) bus.
  • the method can also be applied to other fields, for example, it can be applied to natural language processing, integrated development environment (IDE), information retrieval and other fields.
  • the method described in the present application can be executed by an on-board computing platform or by a server.
  • the server can be a local server, such as a local desktop computer, or a server located in the cloud, such as a central server or an edge server.
  • the cloud can be a public cloud, a private cloud, or a hybrid cloud.
  • Figure 1 is a flow chart of a translator generation method provided by the present application. The method includes but is not limited to the following description.
  • the first DSL is a language developed by the user.
  • the description file may also use an existing DSL, that is, the first DSL may also be an existing DSL.
  • the keywords involved in the first DSL are user-defined words with special meanings, that is, identifiers with special meanings in the first DSL, also called reserved words.
  • Keywords may include one or more of the name of the description file, the import rules of the description file, the type of the translator, and the language used by the translator.
  • module may be used to represent the name of the description file, for example, "module A" indicates that the name of the description file is A.
  • import is used to represent the import rules of the description file, for example, "import B" indicates that description file B is imported.
  • generate is used to represent the type of the generated translator, for example, "generate lexer" indicates that the generated translator is a lexical parser, and "generate parser" indicates that the generated translator is a syntax parser.
  • the language used by the translator refers to the programming language used when generating the translator.
  • the keyword may also include optional configuration items, which refer to items that can be selectively configured by the user according to actual needs.
  • the optional configuration items may include the language used by the translator.
  • the optional configuration items may be represented by options, and the language used by the translator may be represented by language in options.
  • Other keywords may also be defined in the optional configuration items, which are not limited in this application.
  • the keywords involved in the first DSL may also be other keywords, which are not limited in this application.
  • N rules for generating the translator are defined, where N is an integer greater than or equal to 1.
  • the translator to be generated needs to have the functions of a lexical parser and a grammatical parser, and the N rules may include lexical parsing rules and grammatical parsing rules.
  • the lexical parser is used to segment the input character sequence
  • the grammatical parser is used to determine the association relationship between each segmentation according to the segmentation result.
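The split between the two kinds of rules can be illustrated with a minimal sketch. It assumes a hypothetical toy language of parenthesised prefix expressions, which does not appear in the application; `lex` stands in for the lexical parser and `parse` for the grammatical parser.

```python
import re

# Hypothetical toy language (an assumption for illustration only):
# parenthesised prefix expressions such as "(add 1 (mul 2 3))".
TOKEN_RE = re.compile(r"\(|\)|[^\s()]+")


def lex(text):
    """Lexical parsing: segment the input character sequence into tokens."""
    return TOKEN_RE.findall(text)


def parse(tokens):
    """Grammatical parsing: relate the tokens to each other as a nested tree."""
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])          # open a new subtree
        elif tok == ")":
            if len(stack) == 1:
                raise SyntaxError("unbalanced ')'")
            node = stack.pop()        # close the subtree and attach it to its parent
            stack[-1].append(node)
        else:
            stack[-1].append(tok)     # plain token becomes a leaf
    if len(stack) != 1:
        raise SyntaxError("unbalanced '('")
    return stack[0]


tokens = lex("(add 1 (mul 2 3))")
tree = parse(tokens)
```

Here `tokens` is the flat segmentation result, and `tree` records which tokens belong together.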
  • the first rule among the N rules includes multiple target strings and at least one wildcard.
  • the first rule is any one of the N rules.
  • at least one of the N rules includes multiple target strings and at least one wildcard.
  • a character string (string for short) is a sequence of characters consisting of digits, letters and underscores.
  • a character string is a data type that represents text in a programming language.
  • a wildcard is a special symbol used for fuzzy matching, for example when searching for files. When the real characters of a target file are unknown, or typing the complete name is undesirable, wildcards are often used in place of one or more real characters.
  • two kinds of wildcards are defined in the description file. One is "*", which represents any number of arbitrary characters, where the number can be zero, one, or more.
  • for example, a*b represents a string that starts with character a, ends with character b, and contains any number of arbitrary characters in between.
  • the other wildcard is "?", which represents any single character. For example, a?b represents a string that starts with character a, ends with character b, and contains exactly one arbitrary character in between.
  • when the first rule includes one wildcard, the wildcard can be either of the two wildcards; when the first rule includes multiple wildcards, they can all be the same wildcard, for example all "*" or all "?", or they can include both "*" and "?".
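The two wildcard meanings described above can be sketched by translating a pattern into a regular expression. This is only an illustration; the helper names `wildcard_to_regex` and `matches` are hypothetical, and the application does not state how wildcards are matched internally.

```python
import re


def wildcard_to_regex(pattern):
    """Translate '*' (any number of characters) and '?' (one character) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # zero, one, or more arbitrary characters
        elif ch == "?":
            parts.append(".")            # exactly one arbitrary character
        else:
            parts.append(re.escape(ch))  # every other character matches literally
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern, text):
    """Return True if the whole text matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(text) is not None


print(matches("a*b", "ab"), matches("a?b", "ab"))
```

With these definitions, "a*b" accepts "ab" because "*" may stand for zero characters, whereas "a?b" rejects it because "?" must consume exactly one character.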
  • module, import, generate, options, and language are all keywords defined in the description file: "module test" indicates that the name of the description file is test, "import base_rule" indicates that the description file base_rule is imported into the description file test, "generate lexer" indicates that the generated translator is a lexical parser (lexer), options introduces the optional configuration items, and the language used by the translator, configured through language in options, is xxx.
  • the description file is obtained based on the user configuration, which includes keywords and N rules for generating the translator. It should be noted that the description file does not include logic code, which refers to the code that performs logical operations, such as code guided by statements such as if, else, and when.
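As a rough illustration of a purely declarative description file, the sketch below reads the keyword lines into a dictionary. The concrete layout (one keyword per line, options written as `language=xxx`) and the function name `parse_header` are assumptions; the application does not give the exact syntax here.

```python
def parse_header(text):
    """Collect the keyword lines of a description file (assumed one keyword per line)."""
    header = {"module": None, "imports": [], "generate": None, "options": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        rest = rest.strip()
        if keyword == "module":
            header["module"] = rest            # name of the description file
        elif keyword == "import":
            header["imports"].append(rest)     # imported description file
        elif keyword == "generate":
            header["generate"] = rest          # translator type, e.g. lexer or parser
        elif keyword == "options":
            for item in rest.split():          # assumed key=value pairs
                key, _, value = item.partition("=")
                header["options"][key] = value
        else:
            raise ValueError(f"unknown keyword: {keyword}")
    return header


# "xxx" is the placeholder language name used in the text, not a real language.
example = "module test\nimport base_rule\ngenerate lexer\noptions language=xxx\n"
header = parse_header(example)
```

The file carries only declarations, so reading it needs no evaluation of conditions or other logic code.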
  • S102 Generate a translator and a configuration file corresponding to the first DSL based on the description file, where the configuration file includes N rules.
  • a translator and a configuration file corresponding to the first DSL are generated.
  • the configuration file includes N rules, and the N rules in the configuration file can be modified by the user.
  • the user can translate the input text to be translated based on the N rules, or can modify the rules in the configuration file and translate the input text to be translated based on the modified rules.
  • the generated translator can be used to translate the first DSL. For example, a text to be translated is input into the translator, where the text to be translated is written in the first DSL, and the translator can translate the text to be translated into a syntax tree corresponding to the first DSL.
  • the generated syntax tree can be used for other applications.
  • the generated translator can also be used to translate other languages.
  • the N rules in the description file also include recognition rules for the second DSL.
  • the recognition rules for the second DSL include lexical parsing rules and grammatical parsing rules, and the generated translator can also be used to translate the second DSL.
  • the translator can translate the text to be translated into a grammatical tree corresponding to the second DSL, and the generated grammatical tree can be used for other applications.
  • the translator may directly translate the text to be translated into a corresponding syntax tree, wherein the text to be translated may be written based on the first DSL or the second DSL.
  • a configuration file may also be input into the translator, and the translator translates the text to be translated into a corresponding syntax tree based on the rules in the configuration file, wherein the text to be translated may be written based on the first DSL or the second DSL.
  • the N rules in the configuration file may be updated to M rules, specifically including any one or a combination of any number of the following: adding new rules based on the N rules; deleting at least one rule in the N rules; modifying at least one rule in the N rules. That is, at least one rule in the M rules does not belong to the N rules, or at least one rule in the N rules does not belong to the M rules.
  • the configuration file is input into a translator, and the translator translates the text to be translated based on the updated rules to obtain a corresponding syntax tree.
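The update from N rules to M rules can be sketched as follows. Rules are modelled as plain strings in a list, which is an assumption about the configuration file's contents; `update_rules` and `is_changed` are hypothetical helper names.

```python
def update_rules(rules, add=(), delete=(), modify=None):
    """Return the M rules obtained by adding, deleting, or modifying some of the N rules."""
    modify = modify or {}
    updated = []
    for rule in rules:
        if rule in delete:
            continue                             # delete a rule
        updated.append(modify.get(rule, rule))   # modify a rule, or keep it unchanged
    updated.extend(r for r in add if r not in updated)  # add new rules
    return updated


def is_changed(n_rules, m_rules):
    """True if some M rule is not among the N rules, or some N rule is not among the M rules."""
    return set(n_rules) != set(m_rules)


n_rules = ["a*b", "c?d"]
m_rules = update_rules(n_rules, add=["e*f"], delete=["c?d"], modify={"a*b": "a?b"})
```

The condition checked by `is_changed` is the set inequality stated in the text: the two rule sets differ in at least one direction.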
  • the second DSL mentioned above can be any one of the DBC description language, CMake script language, and GCOV description language.
  • DBC is database CAN, which means CAN message database.
  • DBC defines the relevant information of CAN communication
  • DBC description language refers to the language used to describe CAN messages.
  • CMake is a cross-platform build (compilation and installation) tool that uses simple statements, that is, scripts, to describe the build process for all platforms.
  • CMake script language refers to the language used by CMake scripts.
  • CMake script language can be, for example, CMakeLists, CMakeCache or others.
  • GCOV is GNU Coverage, a tool from the GNU project for measuring code coverage and producing coverage reports.
  • Code coverage is a metric in software testing that describes the proportion and degree of source code being tested in a program.
  • GCOV description language refers to the language used to describe code coverage.
  • the second DSL in this application can also be other languages, which are not limited in this application.
  • the description file is obtained based on the configuration and does not include logic code.
  • the method of representing the description file in the present application is simpler; according to the method described in the present application, a translator and a configuration file are generated, wherein the configuration file can be used for users to modify the rules.
  • the rules in the configuration file can be modified, and the modified configuration file can be input into the translator to translate the text to be translated.
  • the translator generated by the method described in the present application can meet the different needs of different users, and the translator has strong applicability; when the translator generated by the method of the present application translates the text to be translated, the text to be translated can be directly input into the translator for translation, which is simple to operate and easy to use; the translator generated by the present application can support translation of multiple languages.
  • a translator and a configuration file corresponding to the first DSL are generated.
  • this can be implemented by a fractal tree.
  • the method includes but is not limited to the description of the following contents.
  • each of the N rules is a branch of the fractal tree, and at least one wildcard in a first rule is a virtual root node on the branch where the first rule is located.
  • Each of the N rules is a branch of the fractal tree, and the fractal tree includes N branches in total. If the first rule includes several wildcards, there will be several virtual root nodes on the branch where the first rule is located. For example, if the first rule includes one wildcard, there will be one virtual root node on the branch where the first rule is located; if the first rule includes two wildcards, there will be two virtual root nodes on the branch where the first rule is located, and so on.
  • the virtual root node is different from the root node.
  • the root node refers to the node where the root of the entire fractal tree is located.
  • the entire fractal tree only includes one root node.
  • the virtual root node is the root node of the branch composed of the target string after the wildcard.
  • a branch can include zero or one or more virtual root nodes, and the entire fractal tree can include one or more virtual root nodes.
  • FIG 4 is a schematic diagram of the structure of a fractal tree provided by the present application.
  • q0 is the root node of the fractal tree
  • the content on the root node can be one or more target strings, or a null character.
  • "-1" is used to represent a virtual root node
  • the content on the virtual root node is a wildcard.
  • the content on other nodes can be one or more target strings, or a null character.
  • the fractal tree shown in Figure 4 includes eight branches, each of which corresponds to a rule.
  • the first branch corresponds to the rule "q0q1q3 first wildcard q10q14q19", where the wildcard on the first branch is called the first wildcard, which can be "*" or "?".
  • the second branch corresponds to the rule "q0q1q3 first wildcard q10q14q20q25"
  • the third branch corresponds to the rule "q0q1q3 first wildcard q10q14q20q26"
  • the fourth branch corresponds to the rule "q0q1q3 first wildcard q10q15q21"
  • the fifth branch corresponds to the rule "q0q1q3 first wildcard q11q16q22"
  • the sixth branch corresponds to the rule "q0q1q3q7q12q17q23"
  • the seventh branch corresponds to the rule "q0q1q4q8"
  • the eighth branch corresponds to the rule "q0q2q5q9 second wildcard q18q24", where the wildcard on the eighth branch is called the second wildcard, and the second wildcard can be "*" or "?".
  • the first branch, the second branch, the third branch, the fourth branch, the fifth branch and the eighth branch each include only one wildcard, and the sixth branch and the seventh branch do not include a wildcard.
  • each branch of the fractal tree may include more or fewer wildcards, that is, each branch may include more or fewer virtual root nodes.
  • the schematic diagram of FIG4 is only used for example and does not constitute a limitation of the present application.
  • the virtual root node is the root node of the branch composed of the target string after the wildcard.
  • the virtual root node where the first wildcard is located is the root node of the branch "first wildcard q10q14q19", the root node of the branch "first wildcard q10q14q20q25", the root node of the branch "first wildcard q10q14q20q26", the root node of the branch "first wildcard q10q15q21", and the root node of the branch "first wildcard q11q16q22".
  • the virtual root node where the second wildcard is located is the root node of the branch "second wildcard q18q24".
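The eight branches listed for FIG. 4 can be assembled into a single tree with a simple prefix-tree sketch. The `Node` class and helper names are hypothetical, each state label is treated as one token, and the first and second wildcards are assumed to be "*" and "?" respectively (the text allows either). This illustrates the shape of the structure, not the application's actual data layout.

```python
WILDCARDS = {"*", "?"}


class Node:
    def __init__(self, label, virtual_root=False):
        self.label = label
        self.virtual_root = virtual_root  # True when the node holds a wildcard
        self.children = {}


def build_tree(rules):
    """Merge rules (token lists sharing the same first token) into one tree."""
    root = Node(rules[0][0])
    for rule in rules:
        if rule[0] != root.label:
            raise ValueError("every rule must start at the shared root")
        node = root
        for tok in rule[1:]:
            if tok not in node.children:
                node.children[tok] = Node(tok, virtual_root=tok in WILDCARDS)
            node = node.children[tok]
    return root


def walk(node):
    """Yield every node of the tree in depth-first order."""
    yield node
    for child in node.children.values():
        yield from walk(child)


RULES = [r.split() for r in [
    "q0 q1 q3 * q10 q14 q19",
    "q0 q1 q3 * q10 q14 q20 q25",
    "q0 q1 q3 * q10 q14 q20 q26",
    "q0 q1 q3 * q10 q15 q21",
    "q0 q1 q3 * q11 q16 q22",
    "q0 q1 q3 q7 q12 q17 q23",
    "q0 q1 q4 q8",
    "q0 q2 q5 q9 ? q18 q24",
]]

root = build_tree(RULES)
nodes = list(walk(root))
branch_count = sum(1 for n in nodes if not n.children)        # one leaf per rule
virtual_root_count = sum(1 for n in nodes if n.virtual_root)  # one per distinct wildcard position
```

Because the first five rules share the prefix up to the first wildcard, they share one virtual root node, so all eight rules fit in one tree that can be traversed in a single pass.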
  • an AC automaton can be constructed, and based on the AC automaton, a translator and a configuration file can be generated.
  • a failure pointer needs to be set.
  • the failure pointer can point to the root node or to a virtual root node.
  • the direction of the failure pointer can be set according to specific circumstances.
  • Figure 5 is an example diagram provided by the present application. As shown in Figure 5, the failure pointers of some nodes are set to return to the root node, and the failure pointers of other nodes are set to return to a virtual root node.
  • Figure 5 is merely an example and does not constitute any limitation on the present application. In practical applications, the return node of the failure pointer can be set according to specific circumstances.
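For orientation, the sketch below shows the classic Aho-Corasick (AC) automaton construction over plain strings, in which failure pointers fall back to the longest matching suffix state or to the root (state 0). It does not model virtual root nodes or wildcards; how the application redirects failure pointers to virtual root nodes is not specified in this text, so the code should be read only as background on the standard technique.

```python
from collections import deque


def build_automaton(keywords):
    """Build goto, failure, and output tables of a standard Aho-Corasick automaton."""
    goto, fail, out = [{}], [0], [[]]
    for word in keywords:                      # phase 1: build the keyword trie
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    queue = deque(goto[0].values())            # depth-1 states fail to the root
    while queue:                               # phase 2: breadth-first failure pointers
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, keyword) for every keyword occurrence in text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]                # follow failure pointers on a mismatch
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits


automaton = build_automaton(["he", "she", "his", "hers"])
hits = search("ushers", automaton)
```

In this classic form every failure chain ends at the single root; the structure described above adds virtual root nodes as further possible failure targets.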
  • the present application provides a new fractal tree structure, in which a virtual root node is set.
  • the N rules in the description file are built into a single tree, so when generating a translator it is only necessary to traverse the fractal tree once.
  • by comparison, when the N rules are built into multiple trees, multiple trees must be traversed when generating a translator.
  • the method described in the present application therefore has low overhead in both time and space.
  • FIG. 6 is a schematic diagram of the structure of a translator generating device 600 provided in the present application.
  • the device 600 can be configured as a vehicle-mounted computing platform, or as a local server, such as a desktop computer, or as a cloud server, such as a central server, an edge server, or as a virtual machine or a container.
  • the device 600 includes:
  • An acquisition module 610 is used to acquire a description file, where the description file defines keywords involved in the first DSL and N rules for generating a translator, where N is greater than or equal to 1, and the description file is obtained based on a configuration, where a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules;
  • the generating module 620 is used to generate a translator and a configuration file corresponding to the first DSL based on the description file, where the configuration file includes N rules.
  • the generation module 620 is used to: generate a fractal tree according to N rules, each of the N rules is a branch of the fractal tree, and at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located, and the virtual root node is different from the root node of the fractal tree; generate a translator and a configuration file according to the fractal tree.
  • the keyword includes one or more of the name of the description file, the import rule of the description file, the type of the translator, and the language used by the translator.
  • the N rules include recognition rules of the second DSL
  • the acquisition module 610 is further used to acquire a text to be translated input by a user, where the text to be translated is written in the first DSL or the second DSL
  • the apparatus 600 further includes: a translation module 630, which is used to translate the text to be translated into a corresponding syntax tree by a translator.
  • the translation module 630 is used to: input the configuration file into a translator; and translate the text to be translated into a corresponding syntax tree through the translator.
  • the apparatus 600 further includes: an updating module 640, configured to update the N rules in the configuration file to M rules, wherein at least one rule of the M rules does not belong to the N rules, or at least one rule of the N rules does not belong to the M rules.
  • the second DSL includes any one of a DBC description language, a CMake script language, and a GCOV description language.
  • The various functional modules in FIG. 6 are used to implement the methods described in the method embodiments of FIG. 1 to FIG. 5.
  • the division of the functional modules in FIG. 6 and the corresponding execution steps of the functional modules are merely an example. In other embodiments, the device 600 may be further divided into more or fewer functional modules according to specific execution steps.
  • the translator generating device 700 can be a vehicle-mounted computing platform, or a local server or a cloud server, such as a desktop computer, an edge server, a central server, etc., or a virtual machine or a container.
  • the translator generating device 700 includes at least one processor 701 and a communication interface 703 , and optionally, further includes a memory 702 .
  • the processor 701 , the memory 702 and the communication interface 703 are interconnected via a bus 704 .
  • the memory 702 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a compact disc read-only memory (CD-ROM).
  • the memory 702 is used to store related computer programs and data.
  • the communication interface 703 is used for receiving and sending data.
  • the processor 701 in the translator generating device 700 is used to read the computer program code stored in the memory 702 and perform the following operations:
  • a description file is obtained, where the description file defines keywords involved in the first DSL and N rules for generating a translator, where N is greater than or equal to 1, and the description file is obtained based on a configuration, where a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules; based on the description file, a translator and a configuration file corresponding to the first DSL are generated, where the configuration file includes the N rules.
  • generating a translator and a configuration file corresponding to the first DSL includes: generating a fractal tree according to N rules, each of the N rules is a branch of the fractal tree, and at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located, and the virtual root node is different from the root node of the fractal tree; generating the translator and the configuration file according to the fractal tree.
  • the N rules include recognition rules of the second DSL, and the method further includes: obtaining a text to be translated input by a user, where the text to be translated is written in the first DSL or the second DSL; and translating the text to be translated into a corresponding syntax tree by the translator.
  • translating the text to be translated into a corresponding syntax tree by a translator includes: inputting a configuration file into the translator; and translating the text to be translated into a corresponding syntax tree by the translator.
  • the method before inputting the configuration file into the translator, the method further includes: updating N rules in the configuration file to M rules, at least one of the M rules does not belong to the N rules, or at least one of the N rules does not belong to the M rules.
  • the second DSL includes any one of a DBC description language, a CMake script language, and a GCOV description language.
  • the processor 701 in the embodiment of the present application can be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field programmable gate arrays (FPGA) or other programmable logic devices, transistor logic devices, hardware components or any combination thereof.
  • the general-purpose processor can be a microprocessor or any conventional processor.
  • when the processor 701 is a CPU, the CPU can be a single-core CPU or a multi-core CPU.
  • the present application also provides a storage medium, including program instructions, which, when executed by a processor, cause the processor to execute the above-mentioned translator generation method.
  • the present application also provides a computer program product, which may be software or a program product containing instructions and capable of running on a computing device or stored in any available medium. When the computer program product runs on a processor, the processor executes the above-mentioned translator generation method.
  • the method steps in the embodiments of the present application can be implemented by hardware or by a processor executing software instructions.
  • the software instructions can be composed of corresponding software modules, and the software modules can be stored in random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, registers, hard disks, mobile hard disks, CD-ROMs, or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to a processor so that the processor can read information from the storage medium and write information to the storage medium.
  • the storage medium can also be a component of the processor.
  • The processor and the storage medium may be located in an ASIC.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)
  • Stored Programmes (AREA)

Abstract

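The present application provides a translator generation method, apparatus, device, and storage medium, relating to the field of translators. The method includes: obtaining a description file, where the description file defines keywords involved in a first DSL and N rules for generating a translator, N is greater than or equal to 1, the description file is obtained based on a configuration, a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules; and generating, based on the description file, a translator corresponding to the first DSL and a configuration file, where the configuration file includes the N rules. A translator generated by the method of the present application is widely applicable and can meet different user needs.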

Description

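Translator generation method, apparatus, device, and storage medium
Technical Field
The present application relates to the field of translators, and in particular to a translator generation method, apparatus, device, and storage medium.
Background
A translator is a tool that converts one computer language into another. Commonly used translators include compilers, assemblers, and interpreters.
At present, common methods for generating translators are cumbersome, and the generated translators can no longer meet users' needs. A new translator is therefore urgently needed.
Summary
The present application provides a translator generation method, apparatus, device, and storage medium. The translator generation method is relatively simple, and a translator generated by the method is widely applicable and can meet different user needs.
In a first aspect, the present application provides a translator generation method, including: obtaining a description file, where the description file defines keywords involved in a first domain-specific language (DSL) and N rules for generating a translator, N is greater than or equal to 1, the description file is obtained based on a configuration, a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules; and generating, based on the description file, a translator corresponding to the first DSL and a configuration file, where the configuration file includes the N rules.
As can be seen, in the present application the description file is obtained from user configuration. The description file defines the keywords involved in the first DSL and the N rules for generating the translator, where the first rule among the N rules includes multiple target strings and at least one wildcard. The translator corresponding to the first DSL and the configuration file are generated from the description file obtained through configuration. Compared with writing the description file using regular expressions, the representation of the description file in the present application is simpler; the method is easy to operate, and the generated translator can meet users' needs.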
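Based on the first aspect, in a possible implementation, generating the translator corresponding to the first DSL and the configuration file based on the description file includes: generating a fractal tree according to the N rules, where each of the N rules is a branch of the fractal tree, the at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located, and the virtual root node is different from the root node of the fractal tree; and generating the translator and the configuration file according to the fractal tree.
As can be seen, the present application proposes a new fractal tree structure in which virtual root nodes are set, so that the N rules in the description file can be built into a single tree, and only that one tree needs to be traversed when generating the translator. By contrast, with a conventional fractal tree structure, multiple trees are built from the N rules, and all of them must be traversed when generating the translator. The method of the present application therefore has lower time complexity and space complexity.
Based on the first aspect, in a possible implementation, the keywords include one or more of: the name of the description file, the import rule of the description file, the type of the translator, and the language used by the translator.
Based on the first aspect, in a possible implementation, the N rules include recognition rules of a second DSL, and the method further includes: obtaining a text to be translated input by a user, where the text to be translated is written in the first DSL or in the second DSL; and translating, by the translator, the text to be translated into a corresponding syntax tree.
It can be understood that the text to be translated is input into the translator, and the corresponding syntax tree is obtained through the translator. In addition, when the N rules include recognition rules of the second DSL, the generated translator can be used to translate the first DSL and can also be used to translate the second DSL.
Based on the first aspect, in a possible implementation, translating, by the translator, the text to be translated into a corresponding syntax tree includes: inputting the configuration file into the translator; and translating, by the translator, the text to be translated into the corresponding syntax tree.
It can be understood that after the text to be translated is input into the translator, the configuration file may also be input into the translator, and the corresponding syntax tree is obtained through the translator.
Based on the first aspect, in a possible implementation, before the configuration file is input into the translator, the method further includes: updating the N rules in the configuration file to M rules, where at least one of the M rules does not belong to the N rules, or at least one of the N rules does not belong to the M rules.
As can be seen, the configuration file allows the user to modify the N rules. That is, when using the translator, the user can modify the rules in the configuration file according to actual needs and input the modified configuration file into the translator, and the translator then translates the input text to be translated. The translator provided by the present application can meet the different needs of different users, is easy to operate, and is widely applicable.
Based on the first aspect, in a possible implementation, the second DSL includes any one of a DBC description language, a CMake script language, and a GCOV description language.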
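In a second aspect, the present application provides a translator generation apparatus, including:
an obtaining module, configured to obtain a description file, where the description file defines keywords involved in a first DSL and N rules for generating a translator, N is greater than or equal to 1, the description file is obtained based on a configuration, a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules; and
a generating module, configured to generate, based on the description file, a translator corresponding to the first DSL and a configuration file, where the configuration file includes the N rules.
Based on the second aspect, in a possible implementation, the generating module is configured to: generate a fractal tree according to the N rules, where each of the N rules is a branch of the fractal tree, the at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located, and the virtual root node is different from the root node of the fractal tree; and generate the translator and the configuration file according to the fractal tree.
Based on the second aspect, in a possible implementation, the keywords include one or more of: the name of the description file, the import rule of the description file, the type of the translator, and the language used by the translator.
Based on the second aspect, in a possible implementation, the N rules include recognition rules of a second DSL, and the apparatus further includes a translation module. The obtaining module is further configured to obtain a text to be translated input by a user, where the text to be translated is written in the first DSL or in the second DSL; the translation module is configured to translate, by the translator, the text to be translated into a corresponding syntax tree.
Based on the second aspect, in a possible implementation, the translation module is configured to: input the configuration file into the translator; and translate, by the translator, the text to be translated into the corresponding syntax tree.
Based on the second aspect, in a possible implementation, the apparatus further includes: an updating module, configured to update the N rules in the configuration file to M rules, where at least one of the M rules does not belong to the N rules, or at least one of the N rules does not belong to the M rules.
Based on the second aspect, in a possible implementation, the second DSL includes any one of a DBC description language, a CMake script language, and a GCOV description language.
The functional modules of the second aspect are used to implement the method described in the first aspect or any possible implementation of the first aspect.
In a third aspect, the present application provides a translator generation device, including a processor and a memory, where the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory to perform the method described in the first aspect or any possible implementation of the first aspect.
Based on the third aspect, in a possible implementation, the device is a vehicle-mounted computing platform or a server, where the server is a local server or a cloud server.
In a fourth aspect, the present application provides a storage medium including program instructions which, when executed by a processor, cause the processor to perform the method described in the first aspect or any possible implementation of the first aspect.
In a fifth aspect, the present application provides a computer program product including program instructions which, when executed by a processor, cause the processor to perform the method described in the first aspect or any possible implementation of the first aspect.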
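Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a translator generation method provided by the present application;
FIG. 2 is an example of description file syntax rules provided by the present application;
FIG. 3 is a partial schematic flowchart of a translator generation method provided by the present application;
FIG. 4 is a schematic structural diagram of a fractal tree provided by the present application;
FIG. 5 is an example diagram provided by the present application;
FIG. 6 is a schematic structural diagram of a translator generation apparatus provided by the present application;
FIG. 7 is a schematic structural diagram of a translator generation device provided by the present application.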
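Detailed Description
The present application provides a translator generation method. Before the method is introduced, the application scenarios it involves are described first. The translator generation method provided by the present application can be applied in the vehicle field. For example, it can be applied to a vehicle-mounted computing platform, which generates a translator according to the method and uses the translator to parse the domain languages involved in the controller area network (CAN) bus. The method can also be applied in other fields, such as natural language processing, integrated development environments (IDE), and information retrieval.
The method of the present application may be executed by a vehicle-mounted computing platform or by a server. The server may be a local server, such as a local desktop computer, or a server located in the cloud, such as a central server or an edge server; the cloud may be a public cloud, a private cloud, or a hybrid cloud. A translator generation method provided by the present application is described below. Referring to FIG. 1, FIG. 1 is a schematic flowchart of a translator generation method provided by the present application. The method includes, but is not limited to, the following description.
S101. Obtain a description file, where the description file defines keywords involved in a first domain-specific language (DSL) and N rules for generating a translator, the description file is obtained based on a configuration, a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules.
The first DSL is a language developed by the user. Optionally, in one implementation, the description file may also use an existing DSL; that is, the first DSL may also be an existing DSL.
The keywords involved in the first DSL are user-defined words with special meaning. They are identifiers with particular significance in the first DSL and may also be called reserved words.
The keywords may include one or more of: the name of the description file, the import rule of the description file, the type of the translator, and the language used by the translator. For example, module may denote the name of the description file: module A indicates that the description file is named A. import may denote the import rule of the description file: import B indicates that description file B is imported. generate may denote the type of the translator to be generated: generate lexer indicates that the generated translator is a lexical analyzer, and generate parser indicates that it is a syntax parser. The language used by the translator is the programming language used when generating the translator and may be denoted by language: language=C++ indicates that the translator is generated in C++, language=Java indicates Java, and language=Python indicates Python.
Optionally, the keywords may also include optional configuration items, that is, items that the user can selectively configure according to actual needs. For example, the optional configuration items may include the language used by the translator: options may denote the optional configuration items, and within options, language may denote the language used by the translator. Other keywords may also be defined in the optional configuration items, which is not limited by the present application. The keywords involved in the first DSL may also be other keywords, which is not limited by the present application.
The N rules for generating the translator are defined according to the specific functions of the translator to be generated, where N is an integer greater than or equal to 1. For example, if the translator to be generated needs to function as both a lexical analyzer and a syntax parser, the N rules may include lexical analysis rules and syntax parsing rules. The lexical analyzer splits the input character sequence into tokens, and the syntax parser determines the relationships between the tokens based on the tokenization result.
The first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules. In other words, at least one of the N rules includes multiple target strings and at least one wildcard. A character string (string for short) is a sequence of characters composed of digits, letters, and underscores, and is the data type used to represent text in programming languages. A wildcard is a special statement used for fuzzy file searches. When searching for a target file, a wildcard is often used in place of one or more real characters when the real characters of the target file are unknown or the user does not want to type them in full.
In the present application, two kinds of wildcards are defined in the description file. One is "*", which represents any number of arbitrary characters, where any number may be zero, one, or more. For example, a*b represents a string that starts with the character a, ends with the character b, and contains any number of arbitrary characters in between. The other wildcard is "?", which represents exactly one arbitrary character. For example, a?b represents a string that starts with the character a, ends with the character b, and contains exactly one arbitrary character in between. It can be understood that when the first rule includes one wildcard, that wildcard may be either of the two kinds; when the first rule includes multiple wildcards, they may all be of the same kind (all "*" or all "?"), or the first rule may include both "*" and "?".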
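The semantics of the two wildcards can be illustrated with a minimal Python sketch. It is not the matching procedure of the present application (which builds a fractal tree); it simply compiles a rule into a regular expression to make the meaning of "*" and "?" concrete. The function names `wildcard_to_regex` and `matches` are hypothetical.

```python
import re


def wildcard_to_regex(rule: str) -> re.Pattern:
    """Compile a rule containing '*' and '?' into a regular expression.

    Illustrative helper only; the name and the regex-based approach are
    assumptions, not part of the described method.
    """
    parts = []
    for ch in rule:
        if ch == "*":
            parts.append(".*")            # any number (zero or more) of arbitrary characters
        elif ch == "?":
            parts.append(".")             # exactly one arbitrary character
        else:
            parts.append(re.escape(ch))   # literal character of a target string
    return re.compile("".join(parts), re.DOTALL)


def matches(rule: str, text: str) -> bool:
    """Return True if the whole text is matched by the wildcard rule."""
    return wildcard_to_regex(rule).fullmatch(text) is not None


if __name__ == "__main__":
    print(matches("a*b", "ab"))     # '*' may stand for zero characters
    print(matches("a?b", "ab"))     # '?' needs exactly one character
```

A rule may mix both wildcards, for example `a*b?c`, in which case each wildcard is interpreted independently.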
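Referring to FIG. 2, FIG. 2 is an example of description file syntax rules provided by the present application. In FIG. 2, module, import, generate, options, and language are all keywords defined in the description file: module test indicates that the description file is named test, import base_rule indicates that the description file base_rule is imported into the description file test, generate lexer indicates that the type of the translator to be generated is a lexer (tokenizing parser), options denotes the optional configuration items, and the language used by the translator, language, configured in options is xxx.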
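As a rough illustration of how the keyword section of such a description file might be read, the following Python sketch parses a header made of module, import, generate, options, and language entries. The exact file layout of FIG. 2 is not reproduced in the text, so the one-keyword-per-line layout, the `SAMPLE` content, and the names `Header` and `parse_header` are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Header:
    """Keyword section of a description file (hypothetical structure)."""
    module: str = ""
    imports: list = field(default_factory=list)
    generate: str = ""
    options: dict = field(default_factory=dict)


def parse_header(text: str) -> Header:
    """Parse keyword lines; the line-based layout is assumed, not specified."""
    header = Header()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            # Optional configuration item such as language=C++
            key, _, value = line.partition("=")
            header.options[key.strip()] = value.strip()
            continue
        keyword, _, rest = line.partition(" ")
        rest = rest.strip()
        if keyword == "module":
            header.module = rest            # name of the description file
        elif keyword == "import":
            header.imports.append(rest)     # imported description file
        elif keyword == "generate":
            header.generate = rest          # translator type: lexer or parser
        elif keyword == "options":
            continue                        # marks the start of optional items
        else:
            raise ValueError(f"unknown keyword: {keyword}")
    return header


SAMPLE = """
module test
import base_rule
generate lexer
options
language=Python
"""

if __name__ == "__main__":
    print(parse_header(SAMPLE))
```

Keeping the header free of logic statements (no if, else, or when) means a parser of this kind only needs to recognize a fixed set of keywords.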
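The description file is obtained based on user configuration, and the configuration includes the keywords and the N rules for generating the translator. It should be noted that the description file does not include logic code. Logic code refers to code that performs logical operations, for example code introduced by statements such as if, else, and when.
S102. Based on the description file, generate a translator corresponding to the first DSL and a configuration file, where the configuration file includes the N rules.
Based on the keywords and the N rules for generating the translator defined in the description file, the translator corresponding to the first DSL and the configuration file are generated. The configuration file includes the N rules, which the user can modify. For example, when the user later uses the translator, the input text to be translated may be translated based on these N rules, or the rules in the configuration file may be modified and the input text to be translated may be translated based on the modified rules.
The generated translator can be used to translate the first DSL. For example, a text to be translated that is written in the first DSL is input into the translator, and the translator can translate it into a syntax tree corresponding to the first DSL. The generated syntax tree can be used by other applications.
The generated translator can also be used to translate other languages. In one implementation, the N rules in the description file further include recognition rules of a second DSL. For example, if the generated translator functions as both a lexical analyzer and a syntax parser, the recognition rules of the second DSL include lexical analysis rules and syntax parsing rules, and the generated translator can then also be used to translate the second DSL. For example, a text to be translated that is written in the second DSL is input into the generated translator, and the translator can translate it into a syntax tree corresponding to the second DSL; the generated syntax tree can be used by other applications.
In one implementation, after the text to be translated is input into the translator, the translator can directly translate it into the corresponding syntax tree, where the text to be translated may be written in the first DSL or in the second DSL.
In another implementation, after the text to be translated is input into the translator, the configuration file may also be input into the translator, and the translator translates the text to be translated into the corresponding syntax tree based on the rules in the configuration file, where the text to be translated may be written in the first DSL or in the second DSL.
Optionally, the N rules in the configuration file may be updated to M rules, which specifically includes any one or any combination of the following: adding new rules to the N rules; deleting at least one of the N rules; and modifying at least one of the N rules. That is, at least one of the M rules does not belong to the N rules, or at least one of the N rules does not belong to the M rules. In yet another implementation, after the N rules in the configuration file are updated to M rules, the configuration file is input into the translator, and the translator translates the text to be translated based on the updated rules to obtain the corresponding syntax tree.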
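The three kinds of update (adding, deleting, modifying) and the condition relating the M rules to the N rules can be sketched in Python as follows. The representation of rules as plain strings and the function names `update_rules` and `is_real_update` are assumptions for illustration; the format of the configuration file itself is not specified in the text.

```python
def update_rules(rules, add=(), remove=(), modify=None):
    """Return a new rule list after deleting, modifying, and adding rules.

    `modify` maps an existing rule to its replacement. All names here are
    hypothetical; rules are modeled as plain strings.
    """
    modify = modify or {}
    updated = []
    for rule in rules:
        if rule in remove:
            continue                              # delete a rule
        updated.append(modify.get(rule, rule))    # modify or keep a rule
    for rule in add:
        if rule not in updated:
            updated.append(rule)                  # add a new rule
    return updated


def is_real_update(old_rules, new_rules):
    """True if some new rule is not among the old rules, or vice versa."""
    return set(old_rules) != set(new_rules)


if __name__ == "__main__":
    n_rules = ["a*b", "c?d", "ef"]
    m_rules = update_rules(n_rules, add=["g*h"], remove=["ef"],
                           modify={"c?d": "c*d"})
    print(m_rules, is_real_update(n_rules, m_rules))
```

Comparing the two rule collections as sets captures the stated condition directly: the sets differ exactly when at least one rule belongs to one collection but not the other.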
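Optionally, the second DSL may be any one of a DBC description language, a CMake script language, and a GCOV description language. DBC stands for database CAN and denotes a CAN message database; a DBC defines information related to CAN communication, and the DBC description language is the language used to describe CAN messages. CMake is a cross-platform installation (build) tool that can describe the installation (build) process for all platforms in simple statements, where the simple statements are scripts; the CMake script language is the language used by CMake scripts and may be, for example, CMakeLists, CMakeCache, or others. GCOV stands for GNU Coverage and denotes a GNU coverage report, where GNU refers to the GNU Project, whose gcov tool computes code coverage. Code coverage is a metric in software testing that describes the proportion and extent to which the source code of a program is tested. The GCOV description language is the language used to describe code coverage. The second DSL in the present application may also be another language, which is not limited by the present application.
As can be seen, in the present application the description file is obtained based on a configuration and does not include logic code; compared with representing the description file using regular expressions, the representation in the present application is simpler. According to the method of the present application, a translator and a configuration file are generated, and the configuration file lets the user modify the rules: when the user's needs change, the rules in the configuration file can be modified and the modified configuration file input into the translator to translate the text to be translated. A translator generated by the method can therefore meet the different needs of different users and is widely applicable. When such a translator is used, the text to be translated is simply input into the translator, which is simple and convenient. The translator generated by the present application can also support translating multiple languages.
Generating the translator corresponding to the first DSL and the configuration file based on the description file may, in one implementation, be realized through a fractal tree. Referring to FIG. 3, FIG. 3 is a partial schematic flowchart of a translator generation method provided by the present application. The method includes, but is not limited to, the following description.
S1021. Generate a fractal tree according to the N rules defined in the description file, where each of the N rules is a branch of the fractal tree, and the at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located.
Since each of the N rules is a branch of the fractal tree, the fractal tree has N branches in total. The branch where the first rule is located has as many virtual root nodes as the first rule has wildcards. For example, if the first rule includes one wildcard, its branch has one virtual root node; if the first rule includes two wildcards, its branch has two virtual root nodes; and so on.
A virtual root node is different from the root node. The root node is the node at the root of the entire fractal tree, and the entire fractal tree has only one root node. A virtual root node is the root node of the branch formed by the target strings that follow a wildcard. A branch may include zero, one, or more virtual root nodes, and the entire fractal tree may include one or more virtual root nodes.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a fractal tree provided by the present application. In the fractal tree shown in FIG. 4, q0 is the root node of the fractal tree, and the content of the root node may be one or more target strings or an empty character. For ease of representation, "-1" denotes a virtual root node, and the content of a virtual root node is a wildcard. Apart from the root node and the virtual root nodes, the content of any other node may be one or more target strings or an empty character.
The fractal tree shown in FIG. 4 includes eight branches, each corresponding to one rule. For example, the first branch corresponds to the rule "q0 q1 q3 [first wildcard] q10 q14 q19", where the wildcard on the first branch is called the first wildcard and may be "*" or "?". Likewise, the second branch corresponds to the rule "q0 q1 q3 [first wildcard] q10 q14 q20 q25", the third branch to "q0 q1 q3 [first wildcard] q10 q14 q20 q26", the fourth branch to "q0 q1 q3 [first wildcard] q10 q15 q21", the fifth branch to "q0 q1 q3 [first wildcard] q11 q16 q22", the sixth branch to "q0 q1 q3 q7 q12 q17 q23", the seventh branch to "q0 q1 q4 q8", and the eighth branch to "q0 q2 q5 q9 [second wildcard] q18 q24", where the wildcard on the eighth branch is called the second wildcard and may be "*" or "?". The terms first and second are used here only to distinguish the wildcard on the first branch from the wildcard on the eighth branch for ease of understanding; in practice, the first wildcard and the second wildcard may be the same or different.
In FIG. 4, the first, second, third, fourth, fifth, and eighth branches each include only one wildcard, and the sixth and seventh branches include no wildcard. In practice, each branch of the fractal tree may include more or fewer wildcards, that is, more or fewer virtual root nodes. FIG. 4 is only an example and does not limit the present application.
It can be understood that a virtual root node is the root node of the branch formed by the target strings that follow a wildcard. In FIG. 4, the virtual root node of the first wildcard is the root node of the branch "[first wildcard] q10 q14 q19", of the branch "[first wildcard] q10 q14 q20 q25", of the branch "[first wildcard] q10 q14 q20 q26", of the branch "[first wildcard] q10 q15 q21", and of the branch "[first wildcard] q11 q16 q22"; the virtual root node of the second wildcard is the root node of the branch "[second wildcard] q18 q24".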
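The idea of placing all rules in one tree, with each wildcard becoming a virtual root node shared by the branches that follow it, can be sketched in Python as a prefix tree. This is an illustrative model only: the labels q0 to q26 are treated as opaque tokens taken from FIG. 4, the choice of "*" for the first wildcard and "?" for the second is an assumption, and the sketch keeps an empty root above q0 for simplicity, whereas FIG. 4 uses q0 itself as the root. The names `Node`, `build_tree`, and `count_nodes` are hypothetical.

```python
from dataclasses import dataclass, field

WILDCARDS = {"*", "?"}


@dataclass
class Node:
    label: str
    virtual_root: bool = False          # True when the node holds a wildcard
    rule_end: bool = False              # True when a rule terminates here
    children: dict = field(default_factory=dict)


def build_tree(rules):
    """Insert every rule (a token list) into one shared tree."""
    root = Node(label="")
    for tokens in rules:
        node = root
        for tok in tokens:
            if tok not in node.children:
                node.children[tok] = Node(tok, virtual_root=tok in WILDCARDS)
            node = node.children[tok]
        node.rule_end = True
    return root


def count_nodes(root):
    """Return (number of branches, number of virtual root nodes)."""
    branches = virtual = 0
    stack = [root]
    while stack:
        node = stack.pop()
        branches += int(node.rule_end)
        virtual += int(node.virtual_root)
        stack.extend(node.children.values())
    return branches, virtual


# The eight rules of FIG. 4, with assumed wildcard symbols.
FIG4_RULES = [r.split() for r in [
    "q0 q1 q3 * q10 q14 q19",
    "q0 q1 q3 * q10 q14 q20 q25",
    "q0 q1 q3 * q10 q14 q20 q26",
    "q0 q1 q3 * q10 q15 q21",
    "q0 q1 q3 * q11 q16 q22",
    "q0 q1 q3 q7 q12 q17 q23",
    "q0 q1 q4 q8",
    "q0 q2 q5 q9 ? q18 q24",
]]

if __name__ == "__main__":
    print(count_nodes(build_tree(FIG4_RULES)))
```

Because the five rules that share the prefix "q0 q1 q3 *" reuse the same wildcard node, the tree for FIG. 4 contains eight branches but only two virtual root nodes.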
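S1022. Generate the translator and the configuration file according to the fractal tree.
An AC automaton can be constructed based on the fractal tree, and the translator and the configuration file are generated based on the AC automaton. Failure pointers need to be set when constructing the AC automaton. With the fractal tree provided by the present application, a failure pointer may point to the root node or to a virtual root node; in practice, the target of each failure pointer can be set according to the specific situation. For ease of understanding, refer to FIG. 5, which is an example diagram provided by the present application. In FIG. 5, the failure pointers of some nodes return to the root node, while the failure pointers of other nodes return to a virtual root node. FIG. 5 is only an example and does not limit the present application in any way; in practice, the node to which a failure pointer returns can be set according to the specific situation.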
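For readers unfamiliar with failure pointers, the following Python sketch shows the standard Aho-Corasick (AC) automaton construction over literal patterns. It is the textbook algorithm, not the variant of the present application: here every failure pointer ultimately falls back to the single root (state 0), and the possibility of a failure pointer returning to a virtual root node is not modeled. The function names `build_automaton` and `search` and the sample patterns are assumptions for illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure, and output tables; state 0 is the root."""
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)

    # Breadth-first pass: depth-1 states keep their failure pointer at the root.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]                       # follow failure pointers upward
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]  # inherit matches ending here
            queue.append(nxt)
    return goto, fail, out


def search(text, automaton):
    """Return (start index, pattern) for every occurrence in the text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


if __name__ == "__main__":
    automaton = build_automaton(["he", "she", "his", "hers"])
    print(search("ushers", automaton))
```

In this standard form a mismatch deep in one pattern resumes at the longest suffix that is also a prefix of some pattern, so the input is scanned once regardless of how many patterns are loaded.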
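As can be seen, the present application provides a new fractal tree structure in which virtual root nodes are set. With virtual root nodes, the N rules in the description file are built into a single tree, and this fractal tree needs to be traversed only once when generating the translator. By contrast, with a conventional fractal tree structure the N rules are built into multiple trees, all of which are traversed when generating the translator. The method of the present application is therefore lightweight in both time and space.
The method of the embodiments of the present application has been described in detail above; the apparatus of the embodiments of the present application is provided below.
Referring to FIG. 6, FIG. 6 is a schematic structural diagram of a translator generation apparatus 600 provided by the present application. The apparatus 600 may be configured as a vehicle-mounted computing platform, as a local server such as a desktop computer, as a cloud server such as a central server or an edge server, or as a virtual machine or a container. The apparatus 600 includes:
an obtaining module 610, configured to obtain a description file, where the description file defines keywords involved in a first DSL and N rules for generating a translator, N is greater than or equal to 1, the description file is obtained based on a configuration, a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules; and
a generating module 620, configured to generate, based on the description file, a translator corresponding to the first DSL and a configuration file, where the configuration file includes the N rules.
In a possible implementation, the generating module 620 is configured to: generate a fractal tree according to the N rules, where each of the N rules is a branch of the fractal tree, the at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located, and the virtual root node is different from the root node of the fractal tree; and generate the translator and the configuration file according to the fractal tree.
In a possible implementation, the keywords include one or more of: the name of the description file, the import rule of the description file, the type of the translator, and the language used by the translator.
In a possible implementation, the N rules include recognition rules of a second DSL, and the obtaining module 610 is further configured to obtain a text to be translated input by a user, where the text to be translated is written in the first DSL or in the second DSL; the apparatus 600 further includes: a translation module 630, configured to translate, by the translator, the text to be translated into a corresponding syntax tree.
In a possible implementation, the translation module 630 is configured to: input the configuration file into the translator; and translate, by the translator, the text to be translated into the corresponding syntax tree.
In a possible implementation, the apparatus 600 further includes: an updating module 640, configured to update the N rules in the configuration file to M rules, where at least one of the M rules does not belong to the N rules.
In a possible implementation, the second DSL includes any one of a DBC description language, a CMake script language, and a GCOV description language.
The functional modules in FIG. 6 are used to implement the methods described in the method embodiments of FIG. 1 to FIG. 5. For details, refer to the description of the method embodiments of FIG. 1 to FIG. 5; for brevity of the specification, they are not repeated here.
It can be understood that the division of the functional modules in FIG. 6 and the steps executed by each functional module are merely an example. In other embodiments, the apparatus 600 may be divided into more or fewer functional modules according to the specific execution steps.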
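Referring to FIG. 7, FIG. 7 is a schematic structural diagram of a translator generation device 700 provided by an embodiment of the present application. The translator generation device 700 may be a vehicle-mounted computing platform, a local server or a cloud server (for example, a desktop computer, an edge server, or a central server), or a virtual machine or a container.
The translator generation device 700 includes at least one processor 701 and a communication interface 703 and, optionally, a memory 702. The processor 701, the memory 702, and the communication interface 703 are interconnected via a bus 704.
The memory 702 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a compact disc read-only memory (CD-ROM). The memory 702 is used to store related computer programs and data. The communication interface 703 is used to receive and send data.
The processor 701 in the translator generation device 700 is configured to read the computer program code stored in the memory 702 and perform the following operations:
obtaining a description file, where the description file defines keywords involved in a first DSL and N rules for generating a translator, N is greater than or equal to 1, the description file is obtained based on a configuration, a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules; and generating, based on the description file, a translator corresponding to the first DSL and a configuration file, where the configuration file includes the N rules.
In a possible implementation, generating the translator corresponding to the first DSL and the configuration file based on the description file includes: generating a fractal tree according to the N rules, where each of the N rules is a branch of the fractal tree, the at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located, and the virtual root node is different from the root node of the fractal tree; and generating the translator and the configuration file according to the fractal tree.
In a possible implementation, the N rules include recognition rules of a second DSL, and the method further includes: obtaining a text to be translated input by a user, where the text to be translated is written in the first DSL or in the second DSL; and translating, by the translator, the text to be translated into a corresponding syntax tree.
In a possible implementation, translating, by the translator, the text to be translated into a corresponding syntax tree includes: inputting the configuration file into the translator; and translating, by the translator, the text to be translated into the corresponding syntax tree.
In a possible implementation, before the configuration file is input into the translator, the method further includes: updating the N rules in the configuration file to M rules, where at least one of the M rules does not belong to the N rules, or at least one of the N rules does not belong to the M rules.
In a possible implementation, the second DSL includes any one of a DBC description language, a CMake script language, and a GCOV description language.
For the implementation and beneficial effects of each operation, refer to the corresponding description of the method embodiments of FIG. 1 to FIG. 5.
It can be understood that the processor 701 in the embodiments of the present application may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general-purpose processor may be a microprocessor or any conventional processor. When the processor 701 is a CPU, the CPU may be a single-core CPU or a multi-core CPU.
The present application also provides a storage medium including program instructions which, when executed by a processor, cause the processor to execute the above translator generation method.
The present application also provides a computer program product, which may be software or a program product that contains instructions and can run on a computing device or be stored in any available medium. When the computer program product runs on a processor, the processor is caused to execute the above translator generation method.
The method steps in the embodiments of the present application may be implemented by hardware or by a processor executing software instructions. The software instructions may consist of corresponding software modules, and the software modules may be stored in a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an erasable programmable read-only memory, an electrically erasable programmable read-only memory, a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from the storage medium and write information to the storage medium. The storage medium may also be a component of the processor. The processor and the storage medium may be located in an ASIC.
In the embodiments of the present application, unless otherwise specified or logically conflicting, the terms and/or descriptions in different embodiments are consistent and may be cited by one another, and the technical features in different embodiments may be combined to form new embodiments according to their inherent logical relationships.
In the present application, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion. For example, a process/method that includes a series of steps, or a system/product/device that includes a series of units, is not necessarily limited to the steps or units that are explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to such process/method/product/device.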

Claims (17)

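  1. A translator generation method, characterized by comprising:
    obtaining a description file, wherein the description file defines keywords involved in a first domain-specific language (DSL) and N rules for generating a translator, N is greater than or equal to 1, the description file is obtained based on a configuration, a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules; and
    generating, based on the description file, a translator corresponding to the first DSL and a configuration file, wherein the configuration file includes the N rules.
  2. The method according to claim 1, characterized in that generating, based on the description file, the translator corresponding to the first DSL and the configuration file comprises:
    generating a fractal tree according to the N rules, wherein each of the N rules is a branch of the fractal tree, the at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located, and the virtual root node is different from the root node of the fractal tree; and
    generating the translator and the configuration file according to the fractal tree.
  3. The method according to claim 1 or 2, characterized in that the keywords include one or more of: the name of the description file, the import rule of the description file, the type of the translator, and the language used by the translator.
  4. The method according to any one of claims 1 to 3, characterized in that the N rules include recognition rules of a second DSL, and the method further comprises:
    obtaining a text to be translated input by a user, wherein the text to be translated is written in the first DSL or in the second DSL; and
    translating, by the translator, the text to be translated into a corresponding syntax tree.
  5. The method according to claim 4, characterized in that translating, by the translator, the text to be translated into a corresponding syntax tree comprises:
    inputting the configuration file into the translator; and
    translating, by the translator, the text to be translated into the corresponding syntax tree.
  6. The method according to claim 5, characterized in that before the configuration file is input into the translator, the method further comprises:
    updating the N rules in the configuration file to M rules, wherein at least one of the M rules does not belong to the N rules, or at least one of the N rules does not belong to the M rules.
  7. The method according to any one of claims 4 to 6, characterized in that the second DSL includes any one of a DBC description language, a CMake script language, and a GCOV description language.
  8. A translator generation apparatus, characterized by comprising:
    an obtaining module, configured to obtain a description file, wherein the description file defines keywords involved in a first domain-specific language (DSL) and N rules for generating a translator, N is greater than or equal to 1, the description file is obtained based on a configuration, a first rule among the N rules includes multiple target strings and at least one wildcard, and the first rule is any one of the N rules; and
    a generating module, configured to generate, based on the description file, a translator corresponding to the first DSL and a configuration file, wherein the configuration file includes the N rules.
  9. The apparatus according to claim 8, characterized in that the generating module is configured to:
    generate a fractal tree according to the N rules, wherein each of the N rules is a branch of the fractal tree, the at least one wildcard in the first rule is a virtual root node on the branch where the first rule is located, and the virtual root node is different from the root node of the fractal tree; and
    generate the translator and the configuration file according to the fractal tree.
  10. The apparatus according to claim 8 or 9, characterized in that the keywords include one or more of: the name of the description file, the import rule of the description file, the type of the translator, and the language used by the translator.
  11. The apparatus according to any one of claims 8 to 10, characterized in that the N rules include recognition rules of a second DSL, and the apparatus further comprises a translation module,
    the obtaining module being further configured to obtain a text to be translated input by a user, wherein the text to be translated is written in the first DSL or in the second DSL; and
    the translation module being configured to translate, by the translator, the text to be translated into a corresponding syntax tree.
  12. The apparatus according to claim 11, characterized in that the translation module is specifically configured to:
    input the configuration file into the translator; and
    translate, by the translator, the text to be translated into the corresponding syntax tree.
  13. The apparatus according to claim 12, characterized in that the apparatus further comprises:
    an updating module, configured to update the N rules in the configuration file to M rules, wherein at least one of the M rules does not belong to the N rules, or at least one of the N rules does not belong to the M rules.
  14. The apparatus according to any one of claims 11 to 13, characterized in that the second DSL includes any one of a DBC description language, a CMake script language, and a GCOV description language.
  15. A translator generation device, characterized by comprising a processor and a memory, wherein the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory to perform the method according to any one of claims 1 to 7.
  16. The device according to claim 15, characterized in that the device is a vehicle-mounted computing platform or a server.
  17. A storage medium, characterized by comprising program instructions which, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 7.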
PCT/CN2023/072163 2023-01-13 2023-01-13 Translator generation method, apparatus, device, and storage medium Ceased WO2024148612A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP23915381.0A EP4650940A4 (en) 2023-01-13 2023-01-13 METHOD AND APPARATUS FOR GENERATING A TRANSLATOR, AND DEVICE AS WELL AS STORAGE MEDIA
CN202380085841.4A CN120476380A (zh) 2023-01-13 2023-01-13 Translator generation method, apparatus, device, and storage medium
PCT/CN2023/072163 WO2024148612A1 (zh) 2023-01-13 2023-01-13 Translator generation method, apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/072163 WO2024148612A1 (zh) 2023-01-13 2023-01-13 Translator generation method, apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2024148612A1 true WO2024148612A1 (zh) 2024-07-18

Family

ID=91897832

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/072163 Ceased WO2024148612A1 (zh) 2023-01-13 2023-01-13 Translator generation method, apparatus, device, and storage medium

Country Status (3)

Country Link
EP (1) EP4650940A4 (zh)
CN (1) CN120476380A (zh)
WO (1) WO2024148612A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040098708A1 (en) * 2002-09-26 2004-05-20 Maiko Taruki Simulator for software development and recording medium having simulation program recorded therein
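CN103677952A * 2013-12-18 2014-03-26 华为技术有限公司 Codec generation apparatus and method
CN107229616A * 2016-03-25 2017-10-03 阿里巴巴集团控股有限公司 Language identification method, apparatus, and system
CN110134397A * 2019-04-12 2019-08-16 深圳壹账通智能科技有限公司 Code snippet translation method and apparatus, computer device, and storage medium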

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FI111107B * 2001-05-15 2003-05-30 Softageneraattori Oy Method for developing a translator and corresponding system
LU92071B1 (en) * 2012-09-12 2014-03-13 Univ Luxembourg Computer-implemented method for computer program translation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4650940A4 *

Also Published As

Publication number Publication date
EP4650940A4 (en) 2026-03-18
CN120476380A (zh) 2025-08-12
EP4650940A1 (en) 2025-11-19

Similar Documents

Publication Publication Date Title
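CN112560100B Data desensitization method and apparatus, computer-readable storage medium, and electronic device
US11334692B2 Extracting a knowledge graph from program source code
CN110502227B Code completion method and apparatus, storage medium, and electronic device
CN111708539A Application code conversion method and apparatus, electronic device, and storage medium
CN112988163B Intelligent programming language adaptation method and apparatus, electronic device, and medium
CN111767055A Quantum program compilation method and apparatus
CN113901083B Multi-parser-based method and device for parsing and locating heterogeneous data source operation resources
CN111880801A Application dynamization method and apparatus, and electronic device
US11740875B2 Type inference in dynamic languages
US11500619B1 Indexing and accessing source code snippets contained in documents
CN115509514B Front-end data simulation method, apparatus, device, and medium
CN108563561B Method and system for extracting implicit program constraints
CN110737431A Software development method, development platform, terminal device, and storage medium
CN110874350A Method and apparatus for processing structured log data
CN113076733A Text matching method, terminal device, and storage medium
CN117472381A Application code processing method, apparatus, device, and storage medium
US20250342191A1 Systems and methods for querying graph databases using natural language queries
EP4650940A1 Translator generation method and apparatus, and device and storage medium
CN114253526B Online pricing method, apparatus, device, and storage medium
KR102117165B1 Method and apparatus for testing intermediate language for binary analysis
Grigorev et al. String-embedded language support in integrated development environment
Piñeiro et al. Perldoop2: A big data-oriented source-to-source Perl-Java compiler
CN117075912B Method for program language conversion, compilation method, and related device
US12487976B2 Automatically improving data annotations by processing annotation properties and user feedback
CN116560761A Method and apparatus for obtaining global function information, electronic device, and storage medium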

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23915381

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202380085841.4

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 202380085841.4

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

WWP Wipo information: published in national office

Ref document number: 2023915381

Country of ref document: EP