microsoft/multilspy: multilspy is an LSP client library in Python intended to be used to build applications around language servers.

This repository hosts multilspy, a library developed as part of the research conducted for the NeurIPS 2023 paper titled “Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context” (“Guiding Language Models of Code with Global Context using Monitors” on arXiv). The paper introduces Monitor-Guided Decoding (MGD) for code generation using Language Models, where a monitor uses static analysis to guide the decoding, ensuring that the generated code follows various correctness properties, such as the absence of hallucinated symbol names, a valid order of method calls, and so on. For further details about Monitor-Guided Decoding, please refer to the paper and the GitHub repository microsoft/monitors4codegen.

multilspy is a cross-platform library designed to simplify the process of creating language server clients that query and obtain the results of various static analyses from a variety of language servers communicating over the Language Server Protocol. It is easily extensible to support any language that has a Language Server, and currently supports Java, Rust, C# and Python. We aim to continuously add support for more language servers and languages.

Language servers are tools that perform a variety of static analyses on code repositories and provide useful information such as type-directed code completion suggestions, symbol definition locations, symbol references, etc., over the Language Server Protocol (LSP). Since LSP is language-agnostic, multilspy can provide the results of static analyses of code in different languages through a common interface.

multilspy seeks to ease the process of using language servers, by managing the various steps in using a language server:

  • Automatically handle the download of platform-specific server binaries, and the setup/teardown of language servers
  • Handle JSON-RPC based communication between the client and the server
  • Maintain and forward hand-tuned server and language specific configuration parameters
  • Provide a simple API to the user, while implementing all the server-specific protocol steps to execute the query/request.
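
To make the JSON-RPC step above concrete, here is a minimal sketch of the message framing defined by the LSP base protocol: a `Content-Length` header, a blank line, then a UTF-8 JSON body. This is not multilspy's internal code; the helper names `encode_message` and `decode_message` are hypothetical, and the file URI simply reuses the placeholder paths from the usage example.

```python
import json


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload the way the LSP base protocol expects."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(data: bytes) -> dict:
    """Parse one framed message and return its JSON payload."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(rest[:length].decode("utf-8"))


# A textDocument/definition request as described by the LSP specification.
# The URI and position are placeholders, not values from a real project.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/definition",
    "params": {
        "textDocument": {
            "uri": "file:///abs/path/to/project/root/relative/path/to/code_file.java"
        },
        "position": {"line": 163, "character": 4},
    },
}

framed = encode_message(request)
print(framed[:24])
print(decode_message(framed)["method"])
```

A client library hides this framing, along with request ids and response matching, behind calls such as `request_definition`.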

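Some of the analysis results that multilspy can provide are symbol definitions, symbol references, completions, document symbols, and hover information; the usage example below shows the corresponding requests.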

It is recommended to create a new virtual environment with python>=3.10. To create a virtual environment using conda and activate it:

conda create -n multilspy_env python=3.10
conda activate multilspy_env

Further details and instructions on creating a Python virtual environment can be found in the official documentation. We also refer users to Miniconda as an alternative to the above steps for creating the virtual environment.

To install multilspy using pip, execute the following command:

pip install multilspy

Usage example:

from multilspy import SyncLanguageServer
from multilspy.multilspy_config import MultilspyConfig
from multilspy.multilspy_logger import MultilspyLogger
...
config = MultilspyConfig.from_dict({"code_language": "java"}) # Also supports "python", "rust", "csharp"
logger = MultilspyLogger()
lsp = SyncLanguageServer.create(config, logger, "/abs/path/to/project/root/")
with lsp.start_server():
    result = lsp.request_definition(
        "relative/path/to/code_file.java", # Filename of location where request is being made
        163, # line number of symbol for which request is being made
        4 # column number of symbol for which request is being made
    )
    result2 = lsp.request_completions(
        ...
    )
    result3 = lsp.request_references(
        ...
    )
    result4 = lsp.request_document_symbols(
        ...
    )
    result5 = lsp.request_hover(
        ...
    )
    ...
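
When working with positions and results, it helps to remember that the LSP specification counts lines and characters from zero, while most editors display them from one. The sketch below shows a hypothetical pair of helpers for converting between the two and for printing a location. It assumes each result entry has the `uri` and `range` fields of an LSP `Location`; the exact shape returned by multilspy, and whether the line number in the example above is zero-based, should be checked against src/multilspy/language_server.py. The sample data is hand-written for illustration only.

```python
from typing import Any, Dict, List


def to_lsp_position(editor_line: int, editor_column: int) -> Dict[str, int]:
    """Convert a 1-based editor position to a 0-based LSP position."""
    if editor_line < 1 or editor_column < 1:
        raise ValueError("editor positions are 1-based")
    return {"line": editor_line - 1, "character": editor_column - 1}


def format_location(location: Dict[str, Any]) -> str:
    """Render an LSP-style Location as 'uri:line:column' using 1-based numbers."""
    start = location["range"]["start"]
    return f'{location["uri"]}:{start["line"] + 1}:{start["character"] + 1}'


# Hand-written sample in the shape of an LSP Location (not real server output).
sample_result: List[Dict[str, Any]] = [
    {
        "uri": "file:///project/src/Foo.java",
        "range": {
            "start": {"line": 10, "character": 4},
            "end": {"line": 10, "character": 12},
        },
    }
]

for loc in sample_result:
    print(format_location(loc))
```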

multilspy also provides an asyncio based API that can be used in async context. Usage example (async):

from multilspy import LanguageServer
...
lsp = LanguageServer.create(...)
async with lsp.start_server():
    result = await lsp.request_definition(
        ...
    )
    ...
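
One reason to prefer the async API is that several requests can be awaited together. The sketch below illustrates the pattern with `asyncio.gather`. To stay runnable without a language server, it uses `StubLanguageServer`, a hypothetical stand-in whose `request_definition` only echoes its arguments; it is not part of multilspy, and whether a given real language server benefits from concurrent requests is an assumption to verify.

```python
import asyncio
from typing import Any, Dict, List, Tuple


class StubLanguageServer:
    """Hypothetical stand-in for an async client; returns echo data only."""

    async def request_definition(
        self, relative_path: str, line: int, column: int
    ) -> List[Dict[str, Any]]:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [{"relativePath": relative_path, "line": line, "column": column}]


async def definitions_for(
    lsp: Any, positions: List[Tuple[str, int, int]]
) -> List[List[Dict[str, Any]]]:
    """Issue one definition request per position and await them together."""
    pending = [lsp.request_definition(path, line, col) for path, line, col in positions]
    # gather returns results in the same order as the awaitables passed in
    return await asyncio.gather(*pending)


positions = [("src/A.java", 10, 4), ("src/B.java", 20, 8)]
results = asyncio.run(definitions_for(StubLanguageServer(), positions))
for (path, _, _), result in zip(positions, results):
    print(path, "->", result)
```

With the real client, the `gather` call would sit inside the `async with lsp.start_server():` block so that the server stays alive while the requests run.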

The file src/multilspy/language_server.py provides the multilspy API. The tests for multilspy under tests/multilspy/ provide detailed usage examples. The tests can be executed by running:

pytest tests/multilspy

Use of multilspy in AI4Code Scenarios such as Monitor-Guided Decoding

multilspy provides all the features that the Language Server Protocol provides to IDEs like VSCode. It is useful for developing toolsets that can interface with AI systems such as Large Language Models (LLMs). One such use case is Monitor-Guided Decoding, where multilspy is used to obtain the results of static analyses, such as type-directed completions, to guide the token-by-token generation of code by an LLM. This ensures that all generated identifiers and method names are valid in the context of the repository, which increases the compilability of the generated code. MGD also demonstrates the use of multilspy to create monitors that ensure that all function calls in LLM-generated code receive the correct number of arguments, and that the functions of an object are called in the order required by a protocol (such as not calling “read” before “open” on a file object).
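
The core idea of guiding decoding with a monitor can be sketched in a few lines. The example below is a deliberately simplified toy, not the implementation from the paper: it assumes the set of valid names at the cursor has already been obtained (for instance from completion results), treats tokens as plain strings, and masks every token that would take the identifier being typed outside that set. The function names, the vocabulary, and the scores are all hypothetical.

```python
import math
from typing import Dict, List


def allowed_tokens(vocab: List[str], typed_prefix: str, valid_names: List[str]) -> List[str]:
    """Keep tokens for which typed_prefix + token is still a prefix of a valid name."""
    return [
        token
        for token in vocab
        if any(name.startswith(typed_prefix + token) for name in valid_names)
    ]


def mask_logits(logits: Dict[str, float], allowed: List[str]) -> Dict[str, float]:
    """Set the score of every disallowed token to negative infinity."""
    keep = set(allowed)
    return {tok: (score if tok in keep else -math.inf) for tok, score in logits.items()}


def greedy_pick(logits: Dict[str, float]) -> str:
    """Pick the highest-scoring token."""
    return max(logits, key=logits.get)


# Toy data: names a static analysis might report, and made-up model scores.
valid_names = ["getName", "getAge", "setName"]
logits = {"get": 1.0, "set": 0.5, "fetch": 3.0, "Name": 0.2, "Age": 0.1}
vocab = list(logits)

# Step 1: the unconstrained model would pick "fetch", which is not a valid name.
masked = mask_logits(logits, allowed_tokens(vocab, "", valid_names))
first = greedy_pick(masked)

# Step 2: continue the identifier, again restricted to valid continuations.
masked_next = mask_logits(logits, allowed_tokens(vocab, first, valid_names))
second = greedy_pick(masked_next)

print(first + second)
```

In this toy run the hallucinated `fetch` is suppressed even though it has the highest raw score, which mirrors how a monitor steers generation toward symbols that exist in the repository.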

Frequently Asked Questions (FAQ)

asyncio related Runtime error when executing tests for MGD

If you get the following error:

RuntimeError: Task  cb=(_chain_future.._call_set_state() at
    python3.8/asyncio/futures.py:367)> got Future  attached to a different loop python3.8/asyncio/locks.py:309: RuntimeError

Please ensure that you create a new environment with Python >=3.10. For further details, please see the related discussion on StackOverflow.
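
A quick way to confirm which interpreter a script or test run is actually using is to check the version at startup. The snippet below is a small sketch; the helper name `python_is_supported` is hypothetical, and the 3.10 threshold is taken from the recommendation above.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version recommended above


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    found = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Warning: Python {found} found, but >=3.10 is recommended.")
```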

If you are using multilspy in your research or applications, please cite using this BibTeX:

@inproceedings{NEURIPS2023_662b1774,
 author = {Agrawal, Lakshya A and Kanade, Aditya and Goyal, Navin and Lahiri, Shuvendu and Rajamani, Sriram},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {A. Oh and T. Naumann and A. Globerson and K. Saenko and M. Hardt and S. Levine},
 pages = {32270--32298},
 publisher = {Curran Associates, Inc.},
 title = {Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context},
 url = {https://proceedings.neurips.cc/paper_files/paper/2023/file/662b1774ba8845fc1fa3d1fc0177ceeb-Paper-Conference.pdf},
 volume = {36},
 year = {2023}
}

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.


2024-12-17 06:43:51
