(Translated by https://www.hiragana.jp/)
[CIR][TargetLowering][NFC] Refactor LowerModule to a unified global state by seven-mile · Pull Request #759 · llvm/clangir · GitHub
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][TargetLowering][NFC] Refactor LowerModule to a unified global state #759

Closed
wants to merge 2 commits into from

Conversation

seven-mile
Copy link
Collaborator

We expect the LowerModule to be available most of the time once CIRGen is complete, providing various support for lowering passes, especially target-specific information. To maintain a consistent state along the call tree and across all libraries, we should add a non-intrusive singleton class.

This PR creates the singleton class LowerModuleRegistry to enable easy access to a lazily initialized LowerModule instance. Its header is public for all CIR libraries, but it does not rely on other internal headers in the TargetLowering library.

Before this PR, the initialization process was as follows:

  • In CallConvLoweringPass, a new LowerModule is created for each function.
  • In LowerToLLVMPass, a new LowerModule is created exactly once.

With this PR, each user initializes LowerModule if it is not yet initialized. Thus, there are three possible initialization points:

  • CIRGenConsumer::HandleTranslationUnit, if called from the Clang driver.
  • CallConvLoweringPass::runOnOperation, if CallConvLoweringPass is enabled by cir-opt.
  • LowerToLLVMPass, for cir-opt and cir-translate.

i.e. in any case, LowerModule is created on demand only once. If we uses cir-opt without CallConvLowering pass, the registry will keep uninitialized till the end.

namespace mlir {
namespace cir {

void LowerModuleRegistry::initializeWithModule(ModuleOp module) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Migrated from the removed function createLowerModule.


/// Registry for the LowerModule, enabling easy access to the LowerModule from
/// various libraries.
class LowerModuleRegistry {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about the naming here. Advice welcome.

@jopperm
Copy link
Contributor

jopperm commented Jul 29, 2024

Two high-level questions:

  • Is this safe to use this together with MLIR's multithreading?
  • Could there be a situation in which a single compiler instance would need to handle multiple modules (i.e. is there a need for re-initialization of the singleton object)?

@seven-mile
Copy link
Collaborator Author

seven-mile commented Jul 29, 2024

Is this safe to use this together with MLIR's multithreading?

No. Because I think it generally makes no sense to support "initialize a global state with multi threading". The caller should avoid that kind of abuse😂. For implicit multi threading, MLIR multi threading generally happens in matchAndRewrite etc. I think it won't be error-prone.

Could there be a situation in which a single compiler instance would need to handle multiple modules (i.e. is there a need for re-initialization of the singleton object)?

I think possibly Yes for the future, but Not Yet for now (?). Do we have a way to explicitly reject such situation? Maybe add an
“exactly once” check to the constructor of CIRGenConsumer?

@seven-mile
Copy link
Collaborator Author

Let me explain why we need such a (seemingly hacky) global state rather than passing the instance around.

Intrusive changes

We have 3 producers (CIRGen, cir-opt and cir-translate) and 2 consumers (CallConvLowering and CIRToLLVMLowering) with a deep call stack. It just involves too many changes. This makes passing the instance around not quite realistic.

Implementation hiding

Our goal is to make LowerModule accessible to all the libraries (like CIRGen). Some of them (producers) just want to call its ctor, others (consumers) would like to do much more. The consumers are okay after #745. But for the easier problem of producers, std::unique_ptr<T> is not enough to hide implementations, because it requires call to dtor in the referential library.

For such problem, we have to either:

  • Expose most headers of Transforms/TargetLowering in include/
  • Or use pImpl idiom to hide its implementation, this is the first version of this PR.
    • To avoid unnecessary indirect calls and intrusive changes, I further simplified it into a container which holds the lifetime of LowerModule instance. Thus we only need to expose the short header of the container class.

And the container of LowerModule is exactly the same thing as the registry, except that the latter also doesn't need a bunch of intrusive changes.

Precedent in MLIR of global state

::mlir::registerPass function does not involve any context. It moves the pass to static ManagedStatic<StringMap<PassInfo>> passRegistry. Later methods like mlir::PassPipelineInfo::lookup(StringRef pipelineArg) queries on the singleton registry.


This is just a simple direct approach to improve the current LowerModule initialization. It has the following drawbacks:

  • Lifetime offset: LowerModule should have the same lifetime as MLIR module. But it's not correctly dealt with now. The lifetime in MLIR is basically managed by MLIR context. If we have a good way to create a singleton bound to MLIR context, that would definitely be a perfect solution. If not, we should still keep the lifetime aligned to MLIR module. I don't have a full plan yet.

  • Design inconsistency: The reference design for this problem in my mind is mlir::DataLayout. Basically we store all the information in the IR itself, and construct a wrapper to query on it at any time we want at a not-quite-high cost, which provides portability/flexibility. At the basis of that, we treat reusing the instance locally as an optimization, rather than a requirement. We can of course give a LowerModuleRegistry for LowerModule, but then what about our cir::DataLayout?

That's my food for better ideas, suggestions welcome and appreciated!

@bcardosolopes
Copy link
Member

Hi, thanks for improving this and for the detailed explanation.

The managed static also makes me a bit scared. I was thinking something way more direct:

  • No registry. Keep LowerModule stuff as is.
  • std::unique_ptr<LowerModule> should be a class member in
    CIRGenConsumer, CallConvLoweringPass, LowerToLLVMPass. Whenever you can pass it around, you std::move it. If you can't make it a class member for any reason, you create an RAII simple struct (e.g. LexicalScope) that handles it's construction and destruction.

@seven-mile
Copy link
Collaborator Author

Hi, thanks for improving this and for the detailed explanation.

The managed static also makes me a bit scared. I was thinking something way more direct:

  • No registry. Keep LowerModule stuff as is.
  • std::unique_ptr<LowerModule> should be a class member in
    CIRGenConsumer, CallConvLoweringPass, LowerToLLVMPass. Whenever you can pass it around, you std::move it. If you can't make it a class member for any reason, you create an RAII simple struct (e.g. LexicalScope) that handles it's construction and destruction.

I think I had a patch of ~600 lines roughly implementing it. LowerModuleContainer is the RAII container with the lazy initialization capability. std::shared_ptr<LowerModuleContainer> is required to simplify the handling of the multi-producer-multi-consumer situation (e.g. invoke CallConvLowering then LowerToLLVM).

The patch outlines the long call stack... Is it viable?

@bcardosolopes
Copy link
Member

Before you write more code, let's discuss a bit more.

multi-producer-multi-consumer situation (e.g. invoke CallConvLowering then LowerToLLVM).

I'm not sure I understand this situation, can you elaborate on why can't CallConvLowering yield ownership before things go down to LowerToLLVM?

@seven-mile
Copy link
Collaborator Author

seven-mile commented Aug 3, 2024

When a pass pipeline constructed by cir-opt includes both CallConvLowering and LowerToLLVM, the ownership of the pass is held by the pass registry. We cannot easily opt in to transfer the instance of LowerModule.

(Of course, CallConvLowering is not yet registered for cir-opt for now.)

@bcardosolopes
Copy link
Member

I changed my mind here, as long as we keep holding one per module the price shouldn't be too high - the complexity tradeoff doesn't seem worth it. Thanks for putting the work on putting this PR, let's followup on your other work. We can revisit this if performance become an issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants