intelmq.bots.parsers.generic package
Submodules
intelmq.bots.parsers.generic.parser_csv module
Generic CSV parser
Parameters:
columns: string
delimiter: string
default_url_protocol: string
skip_header: boolean
type: string
type_translation: string
data_type: string
-
intelmq.bots.parsers.generic.parser_csv.BOT
alias of intelmq.bots.parsers.generic.parser_csv.GenericCsvParserBot
-
class intelmq.bots.parsers.generic.parser_csv.GenericCsvParserBot(bot_id: str, start: bool = False, sighup_event=None, disable_multithreading: Optional[bool] = None)
Bases: intelmq.lib.bot.ParserBot
Parse generic CSV data. Lines starting with the character # are ignored. URLs without a protocol can be prefixed with a default value.
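The two behaviours named in the class description (skipping lines that start with # and prefixing URLs that lack a protocol) can be approximated in plain Python. The following is a minimal sketch using only the standard library, not the bot's actual implementation; the helper names `iter_data_rows` and `with_default_protocol`, the `"://"` check, and the sample data are all assumptions made for illustration.

```python
import csv


def iter_data_rows(raw: str, delimiter: str = ","):
    """Yield CSV rows from raw text, skipping lines that start with '#'."""
    lines = (line for line in raw.splitlines() if not line.startswith("#"))
    yield from csv.reader(lines, delimiter=delimiter)


def with_default_protocol(url: str, default_url_protocol: str) -> str:
    """Prefix the URL with the given protocol if it does not already have one."""
    if "://" in url:
        return url
    return default_url_protocol + url


# Hypothetical sample input: one comment line and two data lines.
raw = "# comment line\nexample.com/a,2020-01-01\nhttps://example.org/b,2020-01-02\n"
rows = list(iter_data_rows(raw))
urls = [with_default_protocol(row[0], "http://") for row in rows]
```

In this sketch the comment line never reaches the CSV reader, and only the first URL receives the prefix because the second already carries a protocol.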
-
column_regex_search: Optional[dict] = None
-
columns: Union[str, Iterable] = None
-
columns_required: Optional[dict] = None
-
compose_fields: Optional[dict] = {}
-
data_type: Optional[dict] = None
-
default_url_protocol: str = ''
-
delimiter: str = ''
-
filter_text = None
-
filter_type = None
-
init()
-
parse(report)
A generator yielding the single elements of the data.
Comments, headers etc. can be processed here. Data needed by self.parse_line can be saved in self.tempdata (list).
The default parser yields stripped lines. Override it for your use case or use an existing parser, e.g.:
parse = ParserBot.parse_csv
You should do the same for recovering lines:
recover_line = ParserBot.recover_line_csv
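The override pattern can be illustrated without IntelMQ installed. This is a sketch built on a hypothetical stand-in class, `MiniParser`, which only mimics the idea of a `parse` generator that yields stripped lines and keeps header material in `self.tempdata`. It is not the `ParserBot` code, and unlike the real method it takes the raw text directly instead of a report object.

```python
class MiniParser:
    """Hypothetical stand-in for illustration, not intelmq.lib.bot.ParserBot."""

    def __init__(self):
        self.tempdata = []  # lines kept for rebuilding a report later

    def parse(self, raw: str):
        """Default behaviour in this sketch: yield each non-empty stripped line."""
        for line in raw.splitlines():
            line = line.strip()
            if line:
                yield line


class CommentAwareParser(MiniParser):
    def parse(self, raw: str):
        """Override: store comment lines in tempdata, yield only data lines."""
        for line in raw.splitlines():
            line = line.strip()
            if not line:
                continue
            if line.startswith("#"):
                self.tempdata.append(line)
            else:
                yield line


parser = CommentAwareParser()
data_lines = list(parser.parse("# feed header\n1.2.3.4,malware\n\n5.6.7.8,phishing\n"))
```

After the generator is consumed, `parser.tempdata` holds the comment line and `data_lines` holds the two data lines.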
-
parse_line(row, report)
A generator which can yield one or more messages contained in the line.
The report contains the full message, so you can access its metadata. Override for your use case.
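As an illustration of a line that yields more than one message, the sketch below uses plain dictionaries in place of IntelMQ report and event objects. The function name `parse_line_sketch`, the `|` separator inside the first column, and the sample values are assumptions for this example only.

```python
def parse_line_sketch(row, report):
    """Yield one event dict per address found in the first column of the row."""
    addresses, classification = row
    for address in addresses.split("|"):
        yield {
            # Metadata copied from the (hypothetical) report dict.
            "feed.name": report.get("feed.name"),
            "source.ip": address,
            "classification.type": classification,
        }


report = {"feed.name": "example-feed"}
events = list(parse_line_sketch(["192.0.2.1|192.0.2.2", "malware"], report))
```

Here a single row produces two events, each carrying the feed name taken from the report.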
-
recover_line(line: str)
Reverse of “parse” for single lines.
Recovers a fully functional report containing only the problematic line by concatenating all strings in “self.tempdata” with “line”, joined by LF newlines. Works fine for most text files.
Parameters:
line: Optional[str], optional
The line currently being processed, which should be restored to its original appearance. As a fallback, “self.current_line” is used if available (depending on self.parse). The default is None.
Raises:
ValueError
If neither the parameter “line” nor the member “self.current_line” is available.
Returns:
str
The reconstructed raw data.
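The recovery logic described above can be sketched as a standalone function. This is a reading of the docstring, not the library's code: the function name `recover_line_sketch` is hypothetical, and `tempdata` and `current_line` are passed in as arguments here instead of being read from `self`.

```python
from typing import List, Optional


def recover_line_sketch(tempdata: List[str],
                        line: Optional[str] = None,
                        current_line: Optional[str] = None) -> str:
    """Join the stored lines and the problematic line with LF newlines."""
    if line is None:
        line = current_line  # fallback, mirroring "self.current_line"
    if line is None:
        raise ValueError("Neither 'line' nor 'current_line' is available.")
    return "\n".join(tempdata + [line])


recovered = recover_line_sketch(["# feed header", "ip,type"], "1.2.3.4,malware")
```

The result is a small, self-contained report: the saved header lines followed by the one line that failed to parse.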
-
skip_header: bool = False
-
time_format = None
-
type: Optional[str] = None
-
type_translation = {}
-