Not parsing system headers with clang

Hello,
I have a program that parses selected C++ files and builds an AST from them.
I'm using the clang::tooling::buildASTFromCodeWithArgs() function to build the tree.
The problem is that if the files I'm parsing have

#include

with some system headers, clang tries to parse and analyse those headers and build an AST from them too.
Is there any way to turn off analysis of those system header includes, or at least of includes that I specify beforehand?

Compilation arguments like -nostdinc, -nostdlib, etc. may help. But if your program depends on types defined in those system headers, you will get compilation errors and the resulting AST will be incomplete because of the missing types.

The ASTBuilderAction builds an AST for the whole translation unit. If you don’t want to deal with those types or decls defined in the system headers, you could filter them out when you’re traversing the AST.


I would like to use types and methods defined in system headers, but I don’t want to waste clang’s time analysing them and building an AST from them. Is there any way to do that?

If there is no such option and clang has to parse these headers so that I can use them in my files later, how could I filter them out so that they are not checked while traversing the AST?

I don’t know. But you could also try PCH:

https://clang.llvm.org/docs/PCHInternals.html

While traversing the top-level decls, you could first query the SourceLocation of the current decl and then use clang::SourceManager::isInSystemHeader to check whether that decl is defined in a system header.


Okay, so I’ve been trying some things recently:

  1. Checking decls with the isInSystemHeader method works fine without any errors, but clang still spends time analysing the system headers, which is what I want to avoid.

  2. I’m running the following line in cmd:
    clang -cc1 -nostdinc++ -ast-dump ATSTESTCLANG.cpp

and yet it prints an error.

Do you have any idea how I could do that?

I’m wondering how much time clang takes to analyse the system headers your code depends on, and why it’s important to avoid that. Have you tried the precompiled headers approach I mentioned above? Basically, you parse the required system headers in advance, so clang can load their AST directly when analysing your code.

-nostdinc++ means parsing your code without the C++ system header directories on the include path. If your code uses types from system headers, you should not use this flag.

For example is analysed for about 10 seconds, and I need the analysis to be as fast as possible in my application.

I’m not quite sure how to use this PCH option. Should I just use -emit-pch and then -include-pch for my .h file with the precompiled headers as arguments to the buildASTFromCodeWithArgs function? Should the PCH file be located in the same directory as the files I’m parsing? And do I need to tell clang somehow to rely on these precompiled headers? Could you please explain this step by step?

You could put those system headers in a header file, say, precompiled.h:

#include <string>
#include <iostream>
#include <...>
#include <...>

and then generate the corresponding pch files by running, for example:

clang -x c++-header precompiled.h -Xclang -emit-pch -o precompiled.h.pch

In the source file, use it like this:

#include "precompiled.h"

and append “-include-pch precompiled.h.pch” to the argument list passed to buildASTFromCodeWithArgs.

The logic here is exactly the same as including a header file, which means you may need to add -Iinclude-dir-to-the-pch-file to the argument list.
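As a minimal sketch of the "append to the argument list" step: the helper name withPrecompiledHeader is hypothetical (it is not part of clang), and the sketch assumes the arguments are held in a std::vector<std::string>, which is what buildASTFromCodeWithArgs accepts. As far as I understand, the flag and the file path should be two separate elements, because each vector element is treated as one command-line token.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of clang): returns a copy of `args` with the
// PCH option appended. The flag and its path are pushed as separate elements,
// on the assumption that each element is handled as one command-line token.
std::vector<std::string> withPrecompiledHeader(std::vector<std::string> args,
                                               const std::string& pchPath)
{
   assert(!pchPath.empty() && "expected a path to the generated .pch file");
   args.push_back("-include-pch");
   args.push_back(pchPath);
   return args;
}
```

The result can then be passed as the args parameter of buildASTFromCodeWithArgs, for example withPrecompiledHeader(args, "precompiled.h.pch"), where the path is whatever location the PCH file was written to.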

Did you skip the top-level decls from the system headers when doing the analysis? I don’t think that merely building an AST from a few system headers would take that much time. Note that the PCH approach can only shorten the time spent building the AST, not the analysis part, which depends on your own code.
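To see which of the two parts actually takes the time, each step can be timed separately. The following is only a sketch: elapsedMilliseconds is a hypothetical helper built on std::chrono, not something clang provides.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in milliseconds, measured with a monotonic clock.
template <typename Callable>
double elapsedMilliseconds(Callable&& work)
{
   const auto start = std::chrono::steady_clock::now();
   std::forward<Callable>(work)();
   const auto stop = std::chrono::steady_clock::now();
   return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping the buildASTFromCodeWithArgs call in one lambda and the finder.matchAST call in another would give two numbers to compare, which should show whether the PCH approach can help at all.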

  1. My code looks like this:
//Generate AST tree
      std::unique_ptr<clang::ASTUnit> ast{
         clang::tooling::buildASTFromCodeWithArgs(fileContent, args, file, "clang",
                                                  std::make_shared<PCHContainerOperations>(),
                                                  tooling::getClangStripDependencyFileAdjuster(),
                                                  tooling::FileContentMappings(),
                                                  diagnosticConsumer)
      };
      //Check diagnostic object - here will be stored errors if any.
      if(ast->getDiagnostics().hasErrorOccurred() || ast->getDiagnostics().hasFatalErrorOccurred())
      {
         bool hasToContinue = false;
         for(auto it = diagnosticConsumer->err_begin(); it != diagnosticConsumer->err_end(); ++it)
         {
            //We gathered errors only from processing file - errors from headers from system are skipped.
            if(false == ast->getSourceManager().isInSystemHeader(it->first))
            {
               result.errors.push_back({ it->second + " " + it->first.printToString(ast->getSourceManager()) });
               hasToContinue = true;
            }
         }
         //Clear diagnostic consumer
         diagnosticConsumer->clear();
         //If there are some errors, skip file processing
         if(true == hasToContinue)
         {
            continue;
         }
      }
      ClassMatcher classMatcher{ result };
      FunctionMatcher functionMatcher{ result };
      GlobalsMatcher globalsMatcher{ result };
      MethodMatcher methodMatcher{ result };


      MatchFinder finder;
      finder.addMatcher(classDefinitionMatcher, &classMatcher);
      finder.addMatcher(globalFunctionDefinitionMatcher, &functionMatcher);
      finder.addMatcher(globalVarsDefinitionMatcher, &globalsMatcher);
      finder.addMatcher(classMethodDefinitionMatcher, &methodMatcher);


      //start analysis
      finder.matchAST(ast->getASTContext());

Then, in the matchers, I’m doing this:

const auto* clangGlobalFunctionDef = result.Nodes.getNodeAs<clang::FunctionDecl>("globalFunction");
   //Handle only global functions (functions inside namespaces are matched too)
   //that were bound by the matcher and are not declared in system headers
   if(nullptr != clangGlobalFunctionDef &&
      false == result.SourceManager->isInSystemHeader(clangGlobalFunctionDef->getLocation()))
   {
      //prevent
   }

Is this the right way to skip these top-level decls?

  2. I also tried the PCH solution exactly as you described, and this is the exception I’m getting:

This only skips top-level function decls. I guess you could add a skipSystemDefinitionMatcher to skip all top-level system decls earlier.

finder.addMatcher(skipSystemDefinitionMatcher, &xxxx);
finder.addMatcher(classDefinitionMatcher, &classMatcher);
finder.addMatcher(globalFunctionDefinitionMatcher, &functionMatcher);
finder.addMatcher(globalVarsDefinitionMatcher, &globalsMatcher);
finder.addMatcher(classMethodDefinitionMatcher, &methodMatcher);

I can’t tell much with so little information. It looks like the diagnostics engine was not initialized correctly. But that should not happen if buildASTFromCodeWithArgs successfully creates the ASTUnit.

Could you check that ast is valid (not null) before using it?


Could you please explain how skipSystemDefinitionMatcher should be defined?
For example, I have classDefinitionMatcher defined like this:

Also, what should the run method of my skip matcher class contain?
Here is the run method of my class method matcher as an example:

void MethodMatcher::run(const clang::ast_matchers::MatchFinder::MatchResult& result)
{
   const auto* methodDef = result.Nodes.getNodeAs<clang::CXXMethodDecl>("classMethod");
   //Check for nullptr before dereferencing, then skip methods without a body
   //and methods declared in system headers
   if(nullptr != methodDef && true == methodDef->hasBody() &&
      false == result.SourceManager->isInSystemHeader(methodDef->getLocation()))
   {
      //Get printing policy to get proper type names
      this->pp = std::make_unique<clang::PrintingPolicy>(result.Context->getLangOpts());

      //Create object with data
      auto methodDefinition = std::make_unique<Method>();

      methodDefinition->className = methodDef->getParent()->getName().str();
      methodDefinition->classNameLoc = methodDef->getParent()->getLocation().printToString(*result.SourceManager);
      methodDefinition->name = methodDef->getIdentifier() ? methodDef->getName().str() : methodDef->getNameAsString();
      methodDefinition->lineBegin = methodDef->getBody()->getSourceRange().getBegin().printToString(*result.SourceManager);
      methodDefinition->lineEnd = methodDef->getBody()->getSourceRange().getEnd().printToString(*result.SourceManager);
      methodDefinition->location = methodDef->getLocation().printToString(*result.SourceManager);

      this->analyseResult.classMethods.push_back(std::move(methodDefinition));
   }
}