Initial Commit

This commit is contained in:
root
2017-02-25 23:55:24 +01:00
commit 1fe2e8ab62
4868 changed files with 1487355 additions and 0 deletions

View File

@@ -0,0 +1,399 @@
# A quick word
nxapi/nxtool is the new learning tool, which attempts to perform the following :
* Events import : Import naxsi events into an ElasticSearch database
* Whitelist generation : Generate whitelists from templates rather than from purely statistical aspects
* Events management : Allow tagging of events in the database to exclude them from the whitelist generation process
* Reporting : Display information about current DB content
# Configuration file : nxapi.json
nxapi uses a JSON file for its settings, such as the one below (the `#` comments are annotations for this README; the actual file must be plain JSON, without comments) :
$ cat nxapi.json
{
# elasticsearch setup, must point to the right instance.
"elastic" : {
"host" : "127.0.0.1:9200",
"index" : "nxapi",
"doctype" : "events",
"default_ttl" : "7200",
"max_size" : "1000"
},
# filter used for any issued requests, you shouldn't modify it yet
"global_filters" : {
"whitelisted" : "false"
},
# global warning and global success rules, used to distinguish good and 'bad' whitelists
"global_warning_rules" : {
"rule_uri" : [ ">", "5" ],
"rule_var_name" : [ ">", "5" ],
"rule_ip" : ["<=", 10 ],
"global_rule_ip_ratio" : ["<", 5]
},
"global_success_rules" : {
"global_rule_ip_ratio" : [">=", 30],
"rule_ip" : [">=", 10]
},
# path to naxsi core rules, path to template files,
# path to geoloc database.
"naxsi" : {
"rules_path" : "/etc/nginx/naxsi_core.rules",
"template_path" : "tpl/",
"geoipdb_path" : "nx_datas/country2coords.txt"
},
# controls default colors and verbosity behavior
"output" : {
"colors" : "true",
"verbosity" : "5"
}
}
# Prerequisites
## Set up ElasticSearch
* Download the archive with the binary files from https://www.elastic.co/downloads/elasticsearch
* Extract the archive
* Start ElasticSearch by executing `bin/elasticsearch` in the extracted folder
* Check whether ElasticSearch is running correctly:
`curl -XGET http://localhost:9200/`
* Add a nxapi index with the following command:
`curl -XPUT 'http://localhost:9200/nxapi/'`
## Populating ElasticSearch with data
* Enable learning mode
* Browse website to generate data in the logfile
* Change into nxapi directory
* Load the data from the log file into ElasticSearch with the following command:
`./nxtool.py -c nxapi.json --files=/PATH/TO/LOGFILE.LOG`
* Check if data was added correctly:
`curl -XPOST "http://localhost:9200/nxapi/events/_search?pretty" -d '{}' `
* Check if nxtool sees it correctly:
`./nxtool.py -c nxapi.json -x`
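If you prefer scripting these checks, here is a minimal sketch (not part of nxtool) using the official `elasticsearch` Python client, assuming a client version matching your ES release and the default index/doctype from nxapi.json :
from elasticsearch import Elasticsearch
es = Elasticsearch(["127.0.0.1:9200"])             # same value as "elastic.host" in nxapi.json
print(es.info())                                   # same check as: curl -XGET http://localhost:9200/
if not es.indices.exists(index="nxapi"):           # same effect as: curl -XPUT http://localhost:9200/nxapi/
    es.indices.create(index="nxapi")
print(es.count(index="nxapi", doc_type="events"))  # rough equivalent of the _search check above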
# Simple usage approach
## 1. Get info about db
$ ./nxtool.py -x --colors -c nxapi.json
This will issue a summary of the database content, including :
* Ratio between tagged/untagged events.
Tagging of events is an important notion that allows you to know how well you are doing on learning.
Let's say you just started learning. You will have a tag ratio of 0%, which means you didn't write any
whitelists for recent events. Once you start generating whitelists, you can provide those (`-w /tmp/wl.cf --tag`)
and nxapi will mark those events in the database as whitelisted, excluding them from the future generation process.
This speeds up the generation process, but above all it tells you how well you dealt with recent false positives.
You can also use the tagging mechanism to exclude obvious attack patterns from learning. If X.Y.Z.W keeps hammering my website and polluting my log, I can provide nxapi with the IP (`-i /tmp/ips.txt --tag`) to tag those events and exclude them from the process (see the short example after this list).
* Top servers.
A TOP10 list of dst hosts raising the most exceptions.
* Top URI(s).
A TOP10 list of dst URIs raising the most exceptions. It is very useful in combination with --filter to generate whitelists for specific URI(s).
* Top Zones.
List of most active zones of exceptions.
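For instance, the IP tagging mentioned above (excluding obvious attack patterns from learning) boils down to the following; the paths and the IP are placeholders, and the file simply holds one IP per line :
$ cat /tmp/ips.txt
X.Y.Z.W
$ ./nxtool.py -c nxapi.json -i /tmp/ips.txt --tag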
## 2. Generate whitelists
Let's say I had the following output :
./nxtool.py -c nxapi.json -x --colors
# Whitelist(ing) ratio :
# false 79.96 % (total:196902/246244)
# true 20.04 % (total:49342/246244)
# Top servers :
# www.x1.fr 21.93 % (total:43181/196915)
# www.x2.fr 15.21 % (total:29945/196915)
...
# Top URI(s) :
# /foo/bar/test 8.55 % (total:16831/196915)
# /user/register 5.62 % (total:11060/196915)
# /index.php/ 4.26 % (total:8385/196915)
...
# Top Zone(s) :
# BODY 41.29 % (total:81309/196924)
# HEADERS 23.2 % (total:45677/196924)
# BODY|NAME 16.88 % (total:33243/196924)
# ARGS 12.47 % (total:24566/196924)
# URL 5.56 % (total:10947/196924)
# ARGS|NAME 0.4 % (total:787/196924)
# FILE_EXT 0.2 % (total:395/196924)
# Top Peer(s) :
# ...
I want to generate whitelists for x1.fr, so I will get more precise statistics first :
./nxtool.py -c nxapi.json -x --colors -s www.x1.fr
...
# Top URI(s) :
# /foo/bar/test 8.55 % (total:16831/196915)
# /index.php/ 4.26 % (total:8385/196915)
...
I will then attempt to generate whitelists for the `/foo/bar/test` page, which seems to trigger the most events :
Take note of the --filter option, which allows me to work on whitelists only for this URI.
Filters can specify any field : var_name, zone, uri, id, whitelisted, content, country, date ...
However, take care, they don't support regexps yet.
Take note as well of the --slack usage, which tells nxtool to ignore success/warning criteria, as my website has too few
visitors, making legitimate exceptions appear as false positives.
./nxtool.py -c nxapi.json -s www.x1.fr -f --filter 'uri /foo/bar/test' --slack
...
#msg: A generic whitelist, true for the whole uri
#Rule (1303) html close tag
#total hits 126
#content : lyiuqhfnp,+<a+href="http://preemptivelove.org/">Cialis+forum</a>,+KKSXJyE,+[url=http://preemptivelove.org/]Viagra+or+cialis[/url],+XGRgnjn,+http
#content : 4ThLQ6++<a+href="http://aoeymqcqbdby.com/">aoeymqcqbdby</a>,+[url=http://ndtofuvzhpgq.com/]ndtofuvzhpgq[/url],+[link..
#peers : x.y.z.w
...
#uri : /faq/
#var_name : numcommande
#var_name : comment
...
# success : global_rule_ip_ratio is 58.82
# warnings : rule_ip is 10
BasicRule wl:1303 "mz:$URL:/foo/bar/test|BODY";
nxtool attempts to provide extra information to allow the user to decide whether it's a false positive :
* content : actual HTTP content, only present if $naxsi_extensive_log is set to 1
* uri : example(s) of URI on which the event was triggered
* var_name : example(s) of variable names in which the content was triggered
* success and warnings : nxapi will provide you with scoring information (see 'scores').
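To make the link between the selected events and the generated naxsi syntax explicit, here is a minimal sketch of how such a BasicRule line is assembled from the event fields. It only covers the simple `$URL:...|ZONE` case shown above; the full logic lives in NxTranslate.tpl2wl() in the nxapi sources :
def basic_rule(wl_id, uri=None, zone="BODY", var_name=None):
    # simplified sketch : whitelist id + matchzone -> BasicRule line
    mz = ""
    if uri:
        mz += "$URL:" + uri + "|"
    if var_name:
        mz += "$" + zone + "_VAR:" + var_name
    else:
        mz += zone
    return 'BasicRule wl:{0} "mz:{1}";'.format(wl_id, mz)

print(basic_rule(1303, uri="/foo/bar/test", zone="BODY"))
# -> BasicRule wl:1303 "mz:$URL:/foo/bar/test|BODY";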
## 3. Interactive whitelist generation
Another way of creating whitelists is to use the -g option, which provides
an interactive way to generate whitelists. It relies on the EDITOR environment
variable to let you iterate over all the servers available inside your
ElasticSearch instance (if the EDITOR variable isn't set, it will try to use `vi`).
You can either delete the lines you don't want to keep, or comment them out with a `#`
at the beginning. After the server selection, it will iterate over each available uri
and zone for each server. If you want to use a regex (only available for uri),
add a `?` at the beginning of each line where you want one:
uri /fr/foo/ ...
?uri /[a-z]{2,}/foo ...
Once all the selections are done, -g attempts to generate the whitelists
with the same behaviour as the -f option, and writes the result to one file per selection.
The typical output when generating whitelists is:
generating wl with filters {u'whitelisted': u'false', 'uri': '/fr/foo', 'server': 'x.com'}
Writing in file: /tmp/server_x.com_0.wl
As you can see, for each selection you get the filters used and the output file.
## 4. Tagging events
Once I have chosen the whitelists that I think are appropriate, I write them to a whitelist file.
Then, I can tag corresponding events :
nxtool.py -c nxapi.json -w /tmp/whitelist.conf --tag
And then, if I look at the report again, I will see a bump in the tagged ratio of events.
Once the ratio is high enough, or the most active remaining URLs & IPs are no longer false positives, it's done!
# Tips and tricks for whitelist generation
* `--filter`
--filter is your friend, especially if you have a lot of exceptions.
By narrowing the search field for whitelists, it will increase speed, and reduce false positives.
* use `-t` instead of `-f`
-f is the "dumb" generation mode, where all templates will be attempted.
If you provide something like `-t "ARGS/*"`, only templates specific to ARGS whitelists will be attempted.
* Create your own templates
If you manage applications that share code/framework/technology, you will quickly find yourself
generating the same whitelists again and again. Stop that! Write your own templates, improving generation time
and accuracy while reducing false positives. Take a practical example:
I'm dealing with magento, like a *lot*. One of the recurring patterns is the "onepage" checkout, so I created specific templates:
{
"_success" : { "rule_ip" : [ ">", "1"]},
"_msg" : "Magento checkout page (BODY|NAME)",
"?uri" : "/checkout/onepage/.*",
"zone" : "BODY|NAME",
"id" : "1310 OR 1311"
}
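Saved under the configured template_path (for example, as a hypothetical tpl/magento/checkout_onepage.tpl), it can then be applied to the relevant vhost only, so that just the Magento-specific templates are attempted :
./nxtool.py -c nxapi.json -s www.myshop.example.com -t "magento/*"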
# Supported options
## Scope/Filtering options
`-s SERVER, --server=SERVER`
Restrict context of whitelist generation or stats display to specific FQDN.
`--filter=FILTER`
A filter (in the form of a dict) to merge with
existing templates/filters: 'uri /foobar zone BODY'.
You can combine several filters, for example : `--filter "country FR" --filter "uri /foobar"`.
## Whitelist generation options
`-t TEMPLATE, --template=TEMPLATE`
Given a path to a template file, attempt to generate matching whitelists.
Possible whitelists will be tested versus database, only the ones with "good" scores will be kept.
If TEMPLATE starts with a '/', it's treated as an absolute path. Otherwise, it's expanded relative to the tpl/ directory.
`-f, --full-auto`
Attempts whitelist generation for all templates present in template_path.
`--slack`
Sets nxtool to ignore scores and display all generated whitelists.
## Tagging options
`-w WL_FILE, --whitelist-path=WL_FILE`
Given a whitelist file, finds matching events in database.
`-i IPS, --ip-path=IPS`
Given a list of ips (separated by \n), finds matching events in database.
`--tag`
Performs tagging. If not specified, matching events are simply displayed.
## Statistics generation options
`-x, --stats`
Generate statistics about current database.
## Importing data
**Note:** All acquisition features expect naxsi EXLOG/FMT content.
`--files=FILES_IN Path to log files to parse.`
Supports globbing and gz/bz2 files, e.g. `--files "/var/log/nginx/*mysite.com*error.log*"`
`--fifo=FIFO_IN Path to a FIFO to be created & read from. [infinite]`
Creates a FIFO, increases F_SETPIPE_SZ, and reads from it. Mostly useful for reading directly from syslog/nginx logs.
`--stdin Read from stdin.`
`--no-timeout Disable timeout on read operations (stdin/fifo).`
# Understanding templates
Templates have a central role within nxapi.
By default only generic ones are provided; you should create your own.
First, look at a generic one to understand how it works :
{
"zone" : "HEADERS",
"var_name" : "cookie",
"id" : "?"
}
Here is how nxtool will use this to generate whitelists:
1. extract global_filters from nxapi.json, and create the base ES filter :
{ "whitelisted" : "false" }
2. merge base ES filter with provided cmd line filter (--filter, -s www.x1.fr)
{ "whitelisted" : "false", "server" : "www.x1.fr" }
3. For each static field of the template, merge it in base ES filter :
{ "whitelisted" : "false", "server" : "www.x1.fr", "zone" : "HEADERS", "var_name" : "cookie" }
4. For each field to be expanded (value is `?`) :
4.1. select all possible values for this field (id) matching base ES filter, (ie. 1000 and 1001 here)
4.2. attempt to generate a whitelist for each possible value, and evaluate its scores.
{ "whitelisted" : "false", "server" : "www.x1.fr", "zone" : "HEADERS", "var_name" : "cookie", "id" : "1000"}
{ "whitelisted" : "false", "server" : "www.x1.fr", "zone" : "HEADERS", "var_name" : "cookie", "id" : "1001"}
5. For each final set that provided results, output a whitelist.
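The same expansion can be sketched in a few lines of Python. This is illustrative only: the hypothetical fetch_unique_values(filters, field) stands in for the ElasticSearch facet/aggregation query nxapi actually issues, and the '?field' regexp form described below is not covered :
def expand_template(template, global_filters, cmdline_filters, fetch_unique_values):
    # steps 1-3 : merge global filters, command-line filters and static template fields
    base = dict(global_filters)
    base.update(cmdline_filters)
    base.update({k: v for k, v in template.items()
                 if not k.startswith(("_", "?")) and v != "?"})
    # step 4 : expand each '?' field into one candidate whitelist per unique value
    candidates = [base]
    for field in [k for k, v in template.items() if v == "?"]:
        candidates = [dict(c, **{field: value})
                      for c in candidates
                      for value in fetch_unique_values(c, field)]
    # step 5 : each candidate is then scored, and output if its score is good enough
    return candidates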
Templates support :
* `"field" : "value"` : A static value that must be present in exception for template to be true.
* `"field" : "?"` : A value that must be expanded from database content (while matching static&global filters).
unique values for "field" will then be used for whitelist generation (one whitelist per unique value).
* `"?field" : "regexp"` : A regular expression for a field content that will be searched in database.
unique values matching regexp for "field" will then be used for whitelist generation (one whitelist per unique value).
* `"_statics" : { "field" : "value" }` : A static value to be used at whitelist generation time. Does not take part in search process,
only at 'output' time. ie. `"_statics" : { "id" : "0" }` is the only way to have a whitelist outputing a 'wl:0'.
* `"_msg" : "string" ` : a text message to help the user understand the template purpose.
* `"_success" : { ... }` : A dict supplied to overwrite/complete 'global' scoring rules.
* `"_warnings" : { ... }` : A dict supplied to overwrite/complete 'global' scoring rules.
# Understanding scoring
Scoring mechanism :
* Scoring mechanism is a very trivial approach, relying on three kinds of "scoring" expressions : _success, _warning, _deny.
* Whenever a _success rule is met while generating a whitelist, it will INCREASE the "score" of the whitelist by 1.
* Whenever a _warning rule is met while generating a whitelist, it will DECREASE the "score" of the whitelist by 1.
* Whenever a _deny rule is met while generating a whitelist, it will disable the whitelist output.
_note:_
In order to understand the scoring mechanism, it is crucial to tell the difference between a template and a rule.
A template is a .json file which can match many events. A rule is usually a subpart of a template's results.
For example, if we have this data :
[ {"id" : 1, "zone" : HEADERS, ip:A.A.A.A},
{"id" : 2, "zone" : HEADERS, ip:A.A.A.A},
{"id" : 1, "zone" : ARGS, ip:A.B.C.D}
]
And this template :
{"id" : 1, "zone" : "?"}
Well, template_ip would be 2, as 2 peers triggered events with ID:1.
However, rule_ip would be 1, as the two generated rules ('id:1 mz:ARGS' and 'id:1 mz:HEADERS')
were each triggered by one unique peer.
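These numbers can be reproduced with a couple of lines of Python (a worked example only, mirroring the data above) :
events = [
    {"id": 1, "zone": "HEADERS", "ip": "A.A.A.A"},
    {"id": 2, "zone": "HEADERS", "ip": "A.A.A.A"},
    {"id": 1, "zone": "ARGS",    "ip": "A.B.C.D"},
]
matching = [e for e in events if e["id"] == 1]      # events matched by {"id" : 1, "zone" : "?"}
template_ip = len({e["ip"] for e in matching})      # -> 2
rule_ip = {z: len({e["ip"] for e in matching if e["zone"] == z})
           for z in {e["zone"] for e in matching}}  # -> {'HEADERS': 1, 'ARGS': 1}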
If --slack is present, scoring is ignored, and all possible whitelists are displayed.
In normal conditions, whitelists with more than 0 points are displayed.
The default scoring rules enabled in nxapi, from nxapi.json :
"global_warning_rules" : {
"rule_ip" : ["<=", 10 ],
"global_rule_ip_ratio" : ["<", 5]
},
"global_success_rules" : {
"global_rule_ip_ratio" : [">=", 10],
"rule_ip" : [">=", 10]
},
"global_deny_rules" : {
"global_rule_ip_ratio" : ["<", 2]
},
* rule_N <= X : "at least" X uniq(N) where present in the specific events from which the WL is generated.
* '"rule_ip" : ["<=", 10 ],' : True if less than 10 unique IPs hit the event
* '"rule_var_name" : [ "<=", "5" ]' : True if less than 5 unique variable names hit the event
* template_N <= X : "at least" X uniq(N) where present in the specific events from which the WL is generated.
* Note the difference with "rule_X" rules.
* global_rule_ip_ratio < X : "at least" X% of the users that triggered events triggered this one as well.
* however, ration can theorically apply to anything, just ip_ratio is the most common.
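A minimal sketch of how such expressions can be evaluated (illustrative only: the real logic lives in NxRating.check_score(), and the counts here would normally be queried from ElasticSearch rather than passed in) :
import operator
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}

def score_whitelist(stats, success_rules, warning_rules, deny_rules):
    # stats is e.g. {"rule_ip": 12, "global_rule_ip_ratio": 58.82} for one candidate whitelist
    for k, (op, val) in deny_rules.items():
        if OPS[op](stats[k], float(val)):
            return None                    # a _deny rule is met : drop the whitelist
    score = 0
    for k, (op, val) in success_rules.items():
        if OPS[op](stats[k], float(val)):
            score += 1                     # each met _success rule : +1
    for k, (op, val) in warning_rules.items():
        if OPS[op](stats[k], float(val)):
            score -= 1                     # each met _warning rule : -1
    return score                           # displayed when score > 0, unless --slack is used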

View File

@@ -0,0 +1,318 @@
{
"title": "naxsi-current+inspect (last 1 hour)",
"services": {
"query": {
"idQueue": [
1,
2,
3,
4
],
"list": {
"0": {
"id": 0,
"type": "topN",
"query": "*",
"alias": "",
"color": "#6ED0E0",
"pin": false,
"enable": true,
"field": "server",
"size": 10,
"union": "AND"
}
},
"ids": [
0
]
},
"filter": {
"idQueue": [
0,
1,
2
],
"list": {
"0": {
"type": "time",
"field": "date",
"from": "now-1h",
"to": "now",
"mandate": "must",
"active": true,
"alias": "",
"id": 0
},
"1": {
"type": "querystring",
"query": "*preprod*",
"mandate": "mustNot",
"active": true,
"alias": "",
"id": 1
}
},
"ids": [
0,
1
]
}
},
"rows": [
{
"title": "events",
"height": "250px",
"editable": true,
"collapse": false,
"collapsable": true,
"panels": [
{
"error": false,
"span": 3,
"editable": true,
"type": "terms",
"loadingEditor": false,
"queries": {
"mode": "all",
"ids": [
0
]
},
"field": "server",
"exclude": [],
"missing": true,
"other": true,
"size": 30,
"order": "count",
"style": {
"font-size": "10pt"
},
"donut": false,
"tilt": false,
"labels": true,
"arrangement": "vertical",
"chart": "bar",
"counter_pos": "below",
"spyable": true,
"title": "sites",
"tmode": "terms",
"tstat": "total",
"valuefield": ""
},
{
"span": 9,
"editable": true,
"type": "histogram",
"loadingEditor": false,
"mode": "count",
"time_field": "date",
"queries": {
"mode": "all",
"ids": [
0
]
},
"value_field": null,
"auto_int": false,
"resolution": 100,
"interval": "1m",
"intervals": [
"auto",
"1s",
"1m",
"5m",
"10m",
"30m",
"1h",
"3h",
"12h",
"1d",
"1w",
"1M",
"1y"
],
"fill": 1,
"linewidth": 3,
"timezone": "browser",
"spyable": true,
"zoomlinks": true,
"bars": true,
"stack": false,
"points": false,
"lines": false,
"legend": true,
"x-axis": true,
"y-axis": true,
"percentage": false,
"interactive": false,
"options": true,
"tooltip": {
"value_type": "individual",
"query_as_alias": true
},
"title": "history",
"scale": 1,
"y_format": "none",
"grid": {
"max": null,
"min": 0
},
"annotate": {
"enable": false,
"query": "*",
"size": 20,
"field": "_type",
"sort": [
"_score",
"desc"
]
},
"pointradius": 5,
"show_query": true,
"legend_counts": true,
"zerofill": true,
"derivative": false
}
],
"notice": false
},
{
"title": "timelines",
"height": "150px",
"editable": true,
"collapse": false,
"collapsable": true,
"panels": [
{
"error": false,
"span": 12,
"editable": true,
"type": "table",
"loadingEditor": false,
"size": 100,
"pages": 5,
"offset": 0,
"sort": [
"date",
"desc"
],
"overflow": "min-height",
"fields": [
"server",
"uri",
"zone",
"var_name",
"ip",
"id",
"content",
"date"
],
"highlight": [
null
],
"sortable": true,
"header": true,
"paging": true,
"field_list": false,
"all_fields": false,
"trimFactor": 300,
"localTime": false,
"timeField": "date",
"spyable": true,
"queries": {
"mode": "all",
"ids": [
0
]
},
"style": {
"font-size": "9pt"
},
"normTimes": true
}
],
"notice": false
}
],
"editable": true,
"failover": false,
"index": {
"interval": "none",
"pattern": "[logstash-]YYYY.MM.DD",
"default": "nxapi",
"warm_fields": true
},
"style": "dark",
"panel_hints": true,
"pulldowns": [
{
"type": "query",
"collapse": true,
"notice": false,
"enable": true,
"query": "*",
"pinned": true,
"history": [
"*",
"www.forum-fic.com"
],
"remember": 10
},
{
"type": "filtering",
"collapse": false,
"notice": true,
"enable": true
}
],
"nav": [
{
"type": "timepicker",
"collapse": false,
"notice": false,
"enable": true,
"status": "Stable",
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
],
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"timefield": "date",
"now": true,
"filter_id": 0
}
],
"loader": {
"save_gist": true,
"save_elasticsearch": true,
"save_local": true,
"save_default": true,
"save_temp": true,
"save_temp_ttl_enable": true,
"save_temp_ttl": "30d",
"load_gist": false,
"load_elasticsearch": true,
"load_elasticsearch_size": 20,
"load_local": false,
"hide": false
},
"refresh": "10s"
}

View File

@@ -0,0 +1,261 @@
AD:42.5462450,1.6015540
AE:23.4240760,53.8478180
AF:33.939110,67.7099530
AG:47.38766640,8.25542950
AI:18.2205540,-63.06861499999999
AL:32.31823140,-86.9022980
AM:-3.41684270,-65.85606460
AN:12.2260790,-69.0600870
AO:47.5162310,14.5500720
AQ:-82.86275189999999,-135.0
AR:35.201050,-91.83183339999999
AS:-14.2709720,-170.1322170
AT:47.5162310,14.5500720
AU:-25.2743980,133.7751360
AW:12.521110,-69.9683380
AX:60.33854850,20.27125850
AZ:34.04892810,-111.09373110
BA:43.9158860,17.6790760
BB:13.1938870,-59.5431980
BD:23.6849940,90.3563310
BE:50.5038870,4.4699360
BF:12.2383330,-1.5615930
BG:42.7338830,25.485830
BH:-19.91906770,-43.93857470
BI:-3.3730560,29.9188860
BJ:9.307689999999999,2.3158340
BM:32.3213840,-64.75736999999999
BN:4.5352770,114.7276690
BO:7.95517910,-11.74099460
BR:-14.2350040,-51.925280
BS:25.034280,-77.39627999999999
BT:27.5141620,90.4336010
BV:47.65806030,-94.87917419999999
BW:-22.3284740,24.6848660
BY:53.7098070,27.9533890
BZ:17.1898770,-88.49764999999999
CA:36.7782610,-119.41793240
CC:-26.58576560,-60.95400730
CD:-4.0383330,21.7586640
CF:6.611110999999999,20.9394440
CG:-0.2280210,15.8276590
CH:46.8181880,8.227511999999999
CI:7.539988999999999,-5.547079999999999
CK:-21.2367360,-159.7776710
CL:-35.6751470,-71.5429690
CM:7.369721999999999,12.3547220
CN:35.861660,104.1953970
CO:39.55005070,-105.78206740
CR:9.748916999999999,-83.7534280
CS:39.56441050,16.25221430
CU:21.5217570,-77.7811670
CV:16.0020820,-24.0131970
CX:-10.4475250,105.6904490
CY:35.1264130,33.4298590
CZ:49.81749199999999,15.4729620
DE:51.165691,10.451526
DJ:11.8251380,42.5902750
DK:56.263920,9.5017850
DM:15.4149990,-61.37097600000001
DO:18.7356930,-70.1626510
DZ:28.0338860,1.6596260
EC:-32.29684020,26.4193890
EE:58.5952720,25.0136070
EG:26.8205530,30.8024980
EH:24.2155270,-12.8858340
ER:15.1793840,39.7823340
ES:-19.18342290,-40.30886260
ET:9.145000000000001,40.4896730
FI:61.92410999999999,25.7481510
FJ:-17.7133710,178.0650320
FK:-51.7962530,-59.5236130
FM:-25.39459690,-58.73736339999999
FO:-25.39459690,-58.73736339999999
FR:46.2276380,2.2137490
FX:27.9026210,-82.7447310
GA:32.15743510,-82.90712300000001
GB:55.3780510,-3.4359730
GD:12.11650,-61.67899999999999
GE:52.0451550,5.871823399999999
GF:3.9338890,-53.1257820
GH:7.9465270,-1.0231940
GI:36.1377410,-5.3453740
GL:71.7069360,-42.6043030
GM:13.4431820,-15.3101390
GN:9.9455870,-9.6966450
GP:-26.27075930,28.11226790
GQ:1.6508010,10.2678950
GR:39.0742080,21.8243120
GS:-54.4295790,-36.5879090
GT:15.7834710,-90.23075899999999
GU:13.4443040,144.7937310
GW:11.8037490,-15.1804130
GY:4.8604160,-58.930180
HK:22.3964280,114.1094970
HM:-53.081810,73.50415799999999
HN:15.1999990,-86.2419050
HR:45.10,15.20
HT:18.9711870,-72.28521499999999
HU:47.1624940,19.5033040
ID:44.06820190,-114.74204080
IE:53.412910,-8.243890
IL:40.63312490,-89.39852830
IN:40.26719410,-86.13490190
IO:-6.3431940,71.8765190
IQ:33.2231910,43.6792910
IR:32.4279080,53.6880460
IS:64.96305099999999,-19.0208350
IT:41.871940,12.567380
JM:18.1095810,-77.29750799999999
JO:30.5851640,36.2384140
JP:36.2048240,138.2529240
KE:-0.0235590,37.9061930
KG:41.204380,74.7660980
KH:12.5656790,104.9909630
KI:-3.3704170,-168.7340390
KM:-11.8750010,43.8722190
KN:17.3578220,-62.7829980
KP:40.3398520,127.5100930
KR:35.9077570,127.7669220
KW:29.311660,47.4817660
KY:37.83933320,-84.27001790
KZ:48.0195730,66.92368399999999
LA:31.24482340,-92.14502449999999
LB:33.8547210,35.8622850
LC:45.93829410,9.3857290
LI:51.44272380,6.06087260
LK:7.873053999999999,80.77179699999999
LR:6.4280550,-9.429499000000002
LS:-29.6099880,28.2336080
LT:55.1694380,23.8812750
LU:49.8152730,6.129582999999999
LV:56.8796350,24.6031890
LY:26.33510,17.2283310
MA:42.40721070,-71.38243740
MC:43.73841760000001,7.424615799999999
MD:39.04575490,-76.64127119999999
MG:-17.9301780,-43.79084530
MH:19.75147980,75.71388840
MK:41.6086350,21.7452750
ML:17.5706920,-3.9961660
MM:21.9139650,95.95622299999999
MN:46.7295530,-94.68589980
MO:37.96425290,-91.83183339999999
MP:-25.5653360,30.52790960
MQ:14.6415280,-61.0241740
MR:21.007890,-10.9408350
MS:32.35466790,-89.39852830
MT:46.87968220,-110.36256580
MU:-20.3484040,57.55215200000001
MV:3.2027780,73.220680
MW:-13.2543080,34.3015250
MX:23.6345010,-102.5527840
MY:4.2104840,101.9757660
MZ:-18.6656950,35.5295620
NA:-22.957640,18.490410
NC:35.75957310,-79.01929969999999
NE:41.49253740,-99.90181310
NF:-29.0408350,167.9547120
NG:9.0819990,8.675276999999999
NI:12.8654160,-85.2072290
NL:53.13550910,-57.66043640
NO:48.10807699999999,15.80495580
NP:28.3948570,84.12400799999999
NR:-0.5227780,166.9315030
NU:70.29977110,-83.10757690
NZ:-40.9005570,174.8859710
OM:21.5125830,55.9232550
PA:41.20332160,-77.19452470
PE:46.5107120,-63.41681359999999
PF:-17.6797420,-149.4068430
PG:5.263234100000001,100.48462270
PH:12.8797210,121.7740170
PK:30.3753210,69.34511599999999
PL:51.9194380,19.1451360
PM:46.9419360,-56.271110
PN:-24.7036150,-127.4393080
PR:-25.25208880,-52.02154150
PS:31.9521620,35.2331540
PT:39.39987199999999,-8.2244540
PW:7.514979999999999,134.582520
PY:-23.4425030,-58.4438320
QA:25.3548260,51.1838840
RE:-21.1151410,55.5363840
RO:45.9431610,24.966760
RU:61.524010,105.3187560
RW:-1.9402780,29.8738880
SA:23.8859420,45.0791620
SB:-9.645709999999999,160.1561940
SC:33.8360810,-81.16372450
SD:43.96951480,-99.90181310
SE:60.12816100000001,18.6435010
SG:1.3520830,103.8198360
SH:54.20907680,9.5889410
SI:46.1512410,14.9954630
SJ:-30.87245870,-68.52471489999999
SK:52.93991590,-106.45086390
SL:-33.87690180,-66.23671720
SM:43.942360,12.4577770
SN:14.4974010,-14.4523620
SO:5.1521490,46.1996160
SR:3.9193050,-56.0277830
ST:0.186360,6.613080999999999
SU:30.6516520,104.0759310
SV:46.8181880,8.227511999999999
SY:34.80207499999999,38.9968150
SZ:-26.5225030,31.4658660
TC:21.6940250,-71.7979280
TD:15.4541660,18.7322070
TF:-53.86711170,-69.2972140
TG:8.6195430,0.8247820
TH:15.8700320,100.9925410
TJ:38.8610340,71.2760930
TK:-8.967362999999999,-171.8558810
TL:-8.8742170,125.7275390
TM:38.9697190,59.5562780
TN:35.51749130,-86.58044730
TO:-11.40987370,-48.71914229999999
TP:37.87774020,12.71351210
TR:38.9637450,35.2433220
TT:10.6918030,-61.2225030
TV:45.78572920,12.19702880
TW:23.697810,120.9605150
TZ:-6.3690280,34.8888220
UA:48.3794330,31.165580
UG:1.3733330,32.2902750
UK:55.3780510,-3.4359730
UM:14.00451050,-176.70562750
US:37.090240,-95.7128910
UY:-32.5227790,-55.7658350
UZ:41.3774910,64.5852620
VA:37.43157340,-78.65689420
VC:12.9843050,-61.2872280
VE:6.423750,-66.589730
VG:18.4206950,-64.6399680
VI:18.3357650,-64.89633499999999
VN:14.0583240,108.2771990
VU:-15.3767060,166.9591580
WF:-13.7687520,-177.1560970
WS:-13.7590290,-172.1046290
YE:15.5527270,48.5163880
YT:64.28232740,-135.0
YU:39.8408430,114.5889030
ZA:-30.5594820,22.9375060
ZM:-13.1338970,27.8493320
ZR:51.80736570,5.70867610
ZW:-19.0154380,29.1548570
BIZ:42.91333330,44.17611110
COM:45.81203170,9.085614999999999
EDU:38.5333020,-121.7879780
GOV:-12.27130120,136.82331380
INT:36.13685970,-80.22767949999999
MIL:60.164480,132.6396450
NET:58.05874000000001,138.2498550
ORG:30.06971249999999,-93.79811680
PRO:41.82904250,-94.15938679999999
AERO:54.85890260,10.38748130
ARPA:39.70400050000001,45.12065270000001
COOP:34.14125450,-118.37270070
INFO:3.134430,101.686250
NAME:27.70287990,85.32163220
NATO:50.8762830,4.4219710

View File

@@ -0,0 +1,38 @@
{
"elastic" : {
"host" : "127.0.0.1:9200",
"use_ssl" : false,
"index" : "nxapi",
"doctype" : "events",
"default_ttl" : "7200",
"max_size" : "1000",
"version" : "2"
},
"syslogd": {
"host" : "0.0.0.0",
"port" : "51400"
},
"global_filters" : {
"whitelisted" : "false"
},
"global_warning_rules" : {
"rule_ip" : ["<=", 10 ],
"global_rule_ip_ratio" : ["<", 5]
},
"global_success_rules" : {
"global_rule_ip_ratio" : [">=", 10],
"rule_ip" : [">=", 10]
},
"global_deny_rules" : {
"global_rule_ip_ratio" : ["<", 2]
},
"naxsi" : {
"rules_path" : "/etc/nginx/naxsi_core.rules",
"template_path" : [ "tpl/"],
"geoipdb_path" : "nx_datas/country2coords.txt"
},
"output" : {
"colors" : "true",
"verbosity" : "5"
}
}

View File

View File

@@ -0,0 +1,540 @@
# Parses a line of log, and potentially returns a dict of dict.
import sys
import pprint
import time
import glob
import logging
import string
import urlparse
import itertools
import gzip
import bz2
from select import select
from functools import partial
import datetime
#import urllib2 as urllib
import json
import copy
from elasticsearch.helpers import bulk
import os
import socket
class NxReader():
""" Feeds the given injector from logfiles """
def __init__(self, acquire_fct, stdin=False, lglob=[], fd=None,
stdin_timeout=5, syslog=None, syslogport=None, sysloghost=None):
self.acquire_fct = acquire_fct
self.files = []
self.timeout = stdin_timeout
self.stdin = False
self.fd = fd
self.syslog = syslog
self.syslogport = syslogport
self.sysloghost = sysloghost
if stdin is not False:
logging.warning("Using stdin")
self.stdin = True
return
if len(lglob) > 0:
for regex in lglob:
self.files.extend(glob.glob(regex))
logging.warning("List of files :"+str(self.files))
if self.fd is not None:
logging.warning("Reading from supplied FD (fifo ?)")
if self.syslog is not None:
logging.warning("Reading from syslog socket")
def read_fd(self, fd):
if self.timeout is not None:
rlist, _, _ = select([fd], [], [], self.timeout)
else:
rlist, _, _ = select([fd], [], [])
success = discard = not_nx = malformed = 0
if rlist:
s = fd.readline()
if s == '':
return s
self.acquire_fct(s)
return True
else:
return False
def read_syslog(self, syslog):
if self.syslogport is not None:
host = self.sysloghost
port = int(self.syslogport)
else:
print "Unable to get syslog host and port"
sys.exit(1)
s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
try:
s.bind((host,port))
s.listen(10)
except socket.error as msg:
print 'Bind failed. Error Code : ' + str(msg[0]) + ' Message ' + msg[1]
pass
print "Listening for syslog incoming "+host+" port "+ str(self.syslogport)
conn, addr = s.accept()
syslog = conn.recv(1024)
if syslog == '':
return False
conn.send(syslog)
self.acquire_fct(syslog)
return True
def read_files(self):
if self.fd is not None:
while True:
ret = self.read_fd(self.fd)
if ret == '':
return False
return 0
if self.syslog is not None:
ret = ""
while self.read_syslog(self.syslog) is True:
pass
return 0
count = 0
total = 0
for lfile in self.files:
success = not_nx = discard = malformed = fragmented = reunited = 0
logging.info("Importing file "+lfile)
try:
if lfile.endswith(".gz"):
print "GZ open"
fd = gzip.open(lfile, "rb")
elif lfile.endswith(".bz2"):
print "BZ2 open"
fd = bz2.BZ2File(lfile, "r")
else:
print "log open"
fd = open(lfile, "r")
except:
logging.critical("Unable to open file : "+lfile)
return 1
for line in fd:
self.acquire_fct(line)
fd.close()
return 0
class NxParser():
def __init__(self):
# output date format
self.out_date_format = "%Y/%m/%d %H:%M:%S"
# Start of Data / End of data marker
self.sod_marker = [' [error] ', ' [debug] ']
self.eod_marker = [', client: ', '']
# naxsi data keywords
self.naxsi_keywords = [" NAXSI_FMT: ", " NAXSI_EXLOG: "]
# keep track of fragmented lines (seed_start=X seed_end=X)
self.reunited_lines = 0
self.fragmented_lines = 0
self.multiline_buf = {}
# store generated objects
self.dict_buf = []
self.bad_line = 0
def unify_date(self, date):
""" tries to parse a text date,
returns date object or None on error """
idx = 0
res = ""
supported_formats = [
"%b %d %H:%M:%S",
"%b %d %H:%M:%S",
"%Y/%m/%d %H:%M:%S",
"%Y-%m-%d %H:%M:%S",
"%Y-%m-%dT%H:%M:%S"
# "%Y-%m-%dT%H:%M:%S+%:z"
]
while date[idx] == " " or date[idx] == "\t":
idx += 1
success = 0
for date_format in supported_formats:
nb_sp = date_format.count(" ")
clean_date = string.join(date.split(" ")[:nb_sp+1], " ")
# strptime does not support numeric time zone, hack.
idx = clean_date.find("+")
if idx != -1:
clean_date = clean_date[:idx]
try:
x = time.strptime(clean_date, date_format)
z = time.strftime(self.out_date_format, x)
success = 1
break
except:
#print "'"+clean_date+"' not in format '"+date_format+"'"
pass
if success == 0:
logging.critical("Unable to parse date format :'"+date+"'")
return None
return z
# returns line, ready for parsing.
# returns none if line contains no naxsi data
def clean_line(self, line):
""" returns an array of [date, "NAXSI_..."] from a
raw log line. 2nd item starts at first naxsi keyword
found. """
ret = [None, None]
# Don't try to parse if no naxsi keyword is found
for word in self.naxsi_keywords:
idx = line.find(word)
if idx != -1:
break
if idx == -1:
return None
line = line.rstrip('\n')
for mark in self.sod_marker:
date_end = line.find(mark)
if date_end != -1:
break
for mark in self.eod_marker:
if mark == '':
data_end = len(line)
break
data_end = line.find(mark)
if data_end != -1:
break
if date_end == -1 or data_end == -1:
self.bad_line += 1
return None
ret[0] = self.unify_date(line[:date_end])
chunk = line[date_end:data_end]
md = None
for word in self.naxsi_keywords:
idx = chunk.find(word)
if (idx != -1):
ret[1] = chunk[idx+len(word):]
if ret[1] is None:
self.bad_line += 1
return None
return ret
# attempts to clean and parse a line
def parse_raw_line(self, line):
clean_dict = self.clean_line(line)
if clean_dict is None:
logging.debug("not a naxsi line")
return None
nlist = self.parse_line(clean_dict[1])
if nlist is None:
return None
return {'date' : clean_dict[0], 'events' : nlist}
def parse_line(self, line):
ndict = self.tokenize_log(line)
if ndict is None:
logging.critical("Unable to tokenize line "+line)
return None
nlist = self.demult_exception(ndict)
return nlist
def demult_exception(self, event):
demult = []
if event.get('seed_start') and event.get('seed_end') is None:
#First line of a multiline naxsi fmt
self.multiline_buf[event['seed_start']] = event
self.fragmented_lines += 1
return demult
elif event.get('seed_start') and event.get('seed_end'):
# naxsi fmt is very long, at least 3 lines
self.fragmented_lines += 1
if self.multiline_buf.get(event['seed_end']) is None:
logging.critical("Orphans end {0} / start {1}".format(event['seed_end'],
event['seed_start']))
return demult
self.multiline_buf[event['seed_end']].update(event)
self.multiline_buf[event['seed_start']] = self.multiline_buf[event['seed_end']]
del self.multiline_buf[event['seed_end']]
return demult
elif event.get('seed_start') is None and event.get('seed_end'):
# last line of the naxsi_fmt, just update the dict, and parse it like a normal line
if self.multiline_buf.get(event['seed_end']) is None:
logging.critical('Got a line with seed_end {0}, but i cant find a matching seed_start...\nLine will probably be incomplete'.format(event['seed_end']))
return demult
self.fragmented_lines += 1
self.reunited_lines += 1
self.multiline_buf[event['seed_end']].update(event)
event = self.multiline_buf[event['seed_end']]
del self.multiline_buf[event['seed_end']]
entry = {}
for x in ['uri', 'server', 'content', 'ip', 'date', 'var_name', 'country']:
entry[x] = event.get(x, '')
clean = entry
# NAXSI_EXLOG lines only have one triple (zone,id,var_name), but has non-empty content
if 'zone' in event.keys():
if 'var_name' in event.keys():
entry['var_name'] = event['var_name']
entry['zone'] = event['zone']
entry['id'] = event['id']
demult.append(entry)
return demult
# NAXSI_FMT can have many (zone,id,var_name), but does not have content
# we iterate over triples.
elif 'zone0' in event.keys():
commit = True
for i in itertools.count():
entry = copy.deepcopy(clean)
zn = ''
vn = ''
rn = ''
if 'var_name' + str(i) in event.keys():
entry['var_name'] = event['var_name' + str(i)]
if 'zone' + str(i) in event.keys():
entry['zone'] = event['zone' + str(i)]
else:
commit = False
break
if 'id' + str(i) in event.keys():
entry['id'] = event['id' + str(i)]
else:
commit = False
break
if commit is True:
demult.append(entry)
else:
logging.warning("Malformed/incomplete event [missing subfield]")
logging.info(pprint.pformat(event))
return demult
return demult
else:
logging.warning("Malformed/incomplete event [no zone]")
logging.info(pprint.pformat(event))
return demult
def tokenize_log(self, line):
"""Parses a naxsi exception to a dict,
1 on error, 0 on success"""
odict = urlparse.parse_qs(line)
# one value per key, reduce.
for x in odict.keys():
odict[x][0] = odict[x][0].replace('\n', "\\n")
odict[x][0] = odict[x][0].replace('\r', "\\r")
odict[x] = odict[x][0]
# check for incomplete/truncated lines
if 'zone0' in odict.keys():
for i in itertools.count():
is_z = is_id = False
if 'zone' + str(i) in odict.keys():
is_z = True
if 'id' + str(i) in odict.keys():
is_id = True
if is_z is True and is_id is True:
continue
if is_z is False and is_id is False:
break
# clean our mess if we have to.
try:
del (odict['zone' + str(i)])
del (odict['id' + str(i)])
del (odict['var_name' + str(i)])
except:
pass
break
return odict
class NxInjector():
def __init__(self, auto_commit_limit=400):
self.nlist = []
self.auto_commit = auto_commit_limit
self.total_objs = 0
self.total_commits = 0
# optional
def get_ready(self):
pass
def insert(self, obj):
self.nlist.append(obj)
if self.auto_commit > 0 and len(self.nlist) > self.auto_commit:
return self.commit()
return True
def commit(self):
return False
def stop(self):
self.commit()
pass
class ESInject(NxInjector):
def __init__(self, es, cfg, auto_commit_limit=400):
#
# self.nlist = []
# self.auto_commit = auto_commit_limit
# super(ESInject, self).__init__(value=20)
NxInjector.__init__(self, auto_commit_limit)
self.es = es
self.cfg = cfg
self.es_version = cfg["elastic"]["version"]
# self.host = host
# self.index = index
# self.collection = collection
# self.login = login
# self.password = password
self.set_mappings()
# def esreq(self, pidx_uri, data, method="PUT"):
# try:
# body = json.dumps(data)
# except:
# print "Unable to dumps data :"+data
# return False
# try:
# print "=>>"+"http://"+self.host+"/"+self.index+pidx_uri
# req = urllib.Request("http://"+self.host+"/"+self.index+pidx_uri, data=body)
# f = urllib.urlopen(req)
# resp = f.read()
# print resp
# f.close()
# except:
# # import traceback
# # print 'generic exception: ' + traceback.format_exc()
# # print "!!Unexpected error:", sys.exc_info()[0]
# #print resp
# logging.critical("Unable to emit request.")
# sys.exit(-1)
# return False
# return True
def set_mappings(self):
if self.es_version == '5':
try:
self.es.indices.create(
index=self.cfg["elastic"]["index"],
ignore=400 # Ignore 400 cause by IndexAlreadyExistsException when creating an index
)
except Exception as idxadd_error:
print "Unable to create the index/collection for ES 5.X: "+self.cfg["elastic"]["index"]+" "+self.cfg["elastic"]["doctype"]+ ", Error: " + str(idxadd_error)
try:
self.es.indices.put_mapping(
index=self.cfg["elastic"]["index"],
doc_type=self.cfg["elastic"]["doctype"],
body={
"events" : {
# * (Note: The _timestamp and _ttl fields were deprecated and are now removed in ES 5.X.
# deleting documents from an index is very expensive compared to deleting whole indexes.
# That is why time based indexes are recommended over this sort of thing and why
# _ttl was deprecated in the first place)
#"_ttl" : { "enabled" : "true", "default" : "4d" },
"properties" : { "var_name" : {"type": "keyword"},
"uri" : {"type": "keyword"},
"zone" : {"type": "keyword"},
"server" : {"type": "keyword"},
"whitelisted" : {"type" : "keyword"},
"ip" : {"type" : "keyword"}
}
}
})
except Exception as mapset_error:
print "Unable to set mapping on index/collection for ES 5.X: "+self.cfg["elastic"]["index"]+" "+self.cfg["elastic"]["doctype"]+", Error: "+str(mapset_error)
return
else:
try:
self.es.create(
index=self.cfg["elastic"]["index"],
doc_type=self.cfg["elastic"]["doctype"],
# id=repo_name,
body={},
ignore=409 # 409 - conflict - would be returned if the document is already there
)
except Exception as idxadd_error:
print "Unable to create the index/collection : "+self.cfg["elastic"]["index"]+" "+self.cfg["elastic"]["doctype"]+", Error: "+str(idxadd_error)
return
try:
self.es.indices.put_mapping(
index=self.cfg["elastic"]["index"],
doc_type=self.cfg["elastic"]["doctype"],
body={
"events" : {
"_ttl" : { "enabled" : "true", "default" : "4d" },
"properties" : { "var_name" : {"type": "string", "index":"not_analyzed"},
"uri" : {"type": "string", "index":"not_analyzed"},
"zone" : {"type": "string", "index":"not_analyzed"},
"server" : {"type": "string", "index":"not_analyzed"},
"whitelisted" : {"type" : "string", "index":"not_analyzed"},
"content" : {"type" : "string", "index":"not_analyzed"},
"ip" : { "type" : "string", "index":"not_analyzed"}
}
}
})
except Exception as mapset_error:
print "Unable to set mapping on index/collection : "+self.cfg["elastic"]["index"]+" "+self.cfg["elastic"]["doctype"]+", Error: "+str(mapset_error)
return
def commit(self):
"""Process list of dict (yes) and push them to DB """
self.total_objs += len(self.nlist)
count = 0
full_body = ""
items = []
for evt_array in self.nlist:
for entry in evt_array['events']:
items.append({"index" : {}})
entry['whitelisted'] = "false"
entry['comments'] = "import:"+str(datetime.datetime.now())
# go utf-8 ?
for x in entry.keys():
if isinstance(entry[x], basestring):
entry[x] = unicode(entry[x], errors='replace')
items.append(entry)
count += 1
mapfunc = partial(json.dumps, ensure_ascii=False)
try:
full_body = "\n".join(map(mapfunc,items)) + "\n"
except:
print "Unexpected error:", sys.exc_info()[0]
print "Unable to json.dumps : "
pprint.pprint(items)
bulk(self.es, items, index=self.cfg["elastic"]["index"], doc_type="events", raise_on_error=True)
self.total_commits += count
logging.debug("Written "+str(self.total_commits)+" events")
print "Written "+str(self.total_commits)+" events"
del self.nlist[0:len(self.nlist)]
class NxGeoLoc():
def __init__(self, cfg):
self.cfg = cfg
try:
import GeoIP
except ImportError:
logging.warning("""Python's GeoIP module is not present.
'World Map' reports won't work,
and you can't use per-country filters.""")
raise
if not os.path.isfile(self.cfg["naxsi"]["geoipdb_path"]):
logging.error("Unable to load GeoIPdb.")
raise ValueError
self.gi = GeoIP.new(GeoIP.GEOIP_MEMORY_CACHE)
def cc2ll(self, country):
""" translates countrycode to lagitude, longitude """
# pun intended
coord = [37.090240,-95.7128910]
try:
fd = open(self.cfg["naxsi"]["geoipdb_path"], "r")
except:
return "Unable to open GeoLoc database, please check your setup."
fd.seek(0)
for cn in fd:
if cn.startswith(country+":"):
x = cn[len(country)+1:-1]
ar = x.split(',')
coord[0] = float(ar[1])
coord[1] = float(ar[0])
break
return coord
def ip2cc(self, ip):
""" translates an IP to a country code """
country = self.gi.country_code_by_addr(ip)
# pun intended
if country is None or len(country) < 2:
country = "CN"
return country
def ip2ll(self, ip):
return self.cc2ll(self.ip2cc(ip))

View File

@@ -0,0 +1,741 @@
import logging
import json
import copy
import operator
import os
import pprint
import shlex
import datetime
import glob
import sys
from nxtypificator import Typificator
class NxConfig():
""" Simple configuration loader """
cfg = {}
def __init__(self, fname):
try:
self.cfg = (json.loads(open(fname).read()))
except:
logging.critical("Unable to open/parse configuration file.")
raise ValueError
class NxRating():
""" A class that is used to check success criterias of rule.
attempts jit querying + caching """
def __init__(self, cfg, es, tr):
self.tr = tr
self.cfg = cfg
self.es = es
self.esq = {
'global' : None,
'template' : None,
'rule' : None}
self.stats = {
'global' : {},
'template' : {},
'rule' : {}
}
self.global_warnings = cfg["global_warning_rules"]
self.global_success = cfg["global_success_rules"]
self.global_deny = cfg["global_deny_rules"]
def drop(self):
""" clears all existing stats """
self.stats['template'] = {}
self.stats['global'] = {}
self.stats['rule'] = {}
def refresh_scope(self, scope, esq):
""" drops all datas for a named scope """
if scope not in self.esq.keys():
print "Unknown scope ?!"+scope
self.esq[scope] = esq
self.stats[scope] = {}
def query_ratio(self, scope, scope_small, score, force_refresh):
""" wrapper to calculate ratio between two vals, rounded float """
#print "ratio :"+str(self.get(scope_small, score))+" / "+str( self.get(scope, score))
ratio = round( (float(self.get(scope_small, score)) / self.get(scope, score)) * 100.0, 2)
return ratio
def get(self, scope, score, scope_small=None, force_refresh=False):
""" fetch a value from self.stats or query ES """
#print "#GET:"+scope+"_?"+str(scope_small)+"?_"+score+" = ?"
if scope not in self.stats.keys():
#print "unknown scope :"+scope
return None
if scope_small is not None:
return self.query_ratio(scope, scope_small, score, force_refresh)
elif score in self.stats[scope].keys() and force_refresh is False:
return self.stats[scope][score]
else:
if score != 'total':
self.stats[scope][score] = self.tr.fetch_uniques(self.esq[scope], score)['total']
else:
res = self.tr.search(self.esq[scope])
self.stats[scope][score] = res['hits']['total']
return self.stats[scope][score]
def check_rule_score(self, tpl):
""" wrapper to check_score, TOFIX ? """
return self.check_score(tpl_success=tpl.get('_success', None),
tpl_warnings=tpl.get('_warnings', None),
tpl_deny=tpl.get('_deny', None))
def check_score(self, tpl_success=None, tpl_warnings=None, tpl_deny=None):
# pprint.pprint(self.stats)
debug = False
success = []
warning = []
deny = False
failed_tests = {"success" : [], "warnings" : []}
glb_success = self.global_success
glb_warnings = self.global_warnings
glb_deny = self.global_deny
for sdeny in [tpl_deny, glb_deny]:
if sdeny is None:
continue
for k in sdeny.keys():
res = self.check_rule(k, sdeny[k])
if res['check'] is True:
# print "WE SHOULD DENY THAT"
deny = True
break
for scheck in [glb_success, tpl_success]:
if scheck is None:
continue
for k in scheck.keys():
res = self.check_rule(k, scheck[k])
if res['check'] is True:
if debug is True:
print "[SUCCESS] OK, on "+k+" vs "+str(res['curr'])+", check :"+str(scheck[k][0])+" - "+str(scheck[k][1])
success.append({'key' : k, 'criteria' : scheck[k], 'curr' : res['curr']})
else:
if debug is True:
print "[SUCCESS] KO, on "+k+" vs "+str(res['curr'])+", check :"+str(scheck[k][0])+" - "+str(scheck[k][1])
failed_tests["success"].append({'key' : k, 'criteria' : scheck[k], 'curr' : res['curr']})
for fcheck in [glb_warnings, tpl_warnings]:
if fcheck is None:
continue
for k in fcheck.keys():
res = self.check_rule(k, fcheck[k])
if res['check'] is True:
if debug is True:
print "[WARNINGS] TRIGGERED, on "+k+" vs "+str(res['curr'])+", check :"+str(fcheck[k][0])+" - "+str(fcheck[k][1])
warning.append({'key' : k, 'criteria' : fcheck[k], 'curr' : res['curr']})
else:
if debug is True:
print "[WARNINGS] NOT TRIGGERED, on "+k+" vs "+str(res['curr'])+", check :"+str(fcheck[k][0])+" - "+str(fcheck[k][1])
failed_tests["warnings"].append({'key' : k, 'criteria' : fcheck[k], 'curr' : res['curr']})
x = { 'success' : success,
'warnings' : warning,
'failed_tests' : failed_tests,
'deny' : deny}
return x
def check_rule(self, label, check_rule):
""" check met/failed success/warning criterias
of a given template vs a set of results """
check = check_rule[0]
beat = check_rule[1]
if label.find("var_name") != -1:
label = label.replace("var_name", "var-name")
items = label.split('_')
for x in range(len(items)):
items[x] = items[x].replace("var-name", "var_name")
if len(items) == 2:
scope = items[0]
score = items[1]
x = self.get(scope, score)
# print "scope:"+str(scope)+" score:"+str(score)
return {'curr' : x, 'check' : check( int(self.get(scope, score)), int(beat))}
elif len(items) == 4:
scope = items[0]
scope_small = items[1]
score = items[2]
x = self.get(scope, score, scope_small=scope_small)
#Xpprint.pprint()
return {'curr' : x, 'check' : check(int(self.get(scope, score, scope_small=scope_small)), int(beat))}
else:
print "cannot understand rule ("+label+"):",
pprint.pprint(check_rule)
return { 'curr' : 0, 'check' : False }
class NxTranslate():
""" Transform Whitelists, template into
ElasticSearch queries, and vice-versa, conventions :
esq : elasticsearch query
tpl : template
cr : core rule
wl : whitelist """
def __init__(self, es, cfg):
self.es = es
self.debug = True
self.cfg = cfg.cfg
self.cfg["global_warning_rules"] = self.normalize_checks(self.cfg["global_warning_rules"])
self.cfg["global_success_rules"] = self.normalize_checks(self.cfg["global_success_rules"])
self.cfg["global_deny_rules"] = self.normalize_checks(self.cfg["global_deny_rules"])
self.core_msg = {}
# by default, es queries will return 1000 results max
self.es_max_size = self.cfg.get("elastic").get("max_size", 1000)
print "# size :"+str(self.es_max_size)
# purely for output coloring
self.red = u'{0}'
self.grn = u'{0}'
self.blu = u'{0}'
if self.cfg["output"]["colors"] == "true":
self.red = u"\033[91m{0}\033[0m"
self.grn = u"\033[92m{0}\033[0m"
self.blu = u"\033[94m{0}\033[0m"
# Attempt to parse provided core rules file
self.load_cr_file(self.cfg["naxsi"]["rules_path"])
def full_auto(self, to_fill_list=None):
""" Loads all tpl within template_path
If templates has hit, peers or url(s) ratio > 15%,
attempts to generate whitelists.
Only displays the wl that did not raise warnings, ranked by success"""
# gather total IPs, total URIs, total hit count
scoring = NxRating(self.cfg, self.es, self)
strict = True
if self.cfg.get("naxsi").get("strict", "") == "false":
strict = False
scoring.refresh_scope("global", self.cfg["global_filters"])
if scoring.get("global", "ip") <= 0:
return []
output = []
for sdir in self.cfg["naxsi"]["template_path"]:
for root, dirs, files in os.walk(sdir):
for file in files:
if file.endswith(".tpl"):
output.append("# {0}{1}/{2} ".format(
self.grn.format(" template :"),
root,
file
))
template = self.load_tpl_file(root+"/"+file)
scoring.refresh_scope('template', self.tpl2esq(template))
output.append("Nb of hits : {0}".format(scoring.get('template', 'total')))
if scoring.get('template', 'total') > 0:
output.append('{0}'.format(self.grn.format("# template matched, generating all rules.")))
whitelists = self.gen_wl(template, rule={})
# x add here
output.append('{0}'.format(len(whitelists))+" whitelists ...")
for genrule in whitelists:
scoring.refresh_scope('rule', genrule['rule'])
results = scoring.check_rule_score(template)
# XX1
if (len(results['success']) > len(results['warnings']) and results["deny"] == False) or self.cfg["naxsi"]["strict"] == "false":
# print "?deny "+str(results['deny'])
try:
str_genrule = '{0}'.format(self.grn.format(self.tpl2wl(genrule['rule']).encode('utf-8', 'replace'), template))
except UnicodeDecodeError:
logging.warning('WARNING: Unprocessable string found in the elastic search')
output.append(self.fancy_display(genrule, results, template))
output.append(str_genrule)
if to_fill_list is not None:
genrule.update({'genrule': str_genrule})
to_fill_list.append(genrule)
return output
def wl_on_type(self):
for rule in Typificator(self.es, self.cfg).get_rules():
print 'BasicRule negative "rx:{0}" "msg:{1}" "mz:${2}_VAR:{3}" "s:BLOCK";'.format(*rule)
def fancy_display(self, full_wl, scores, template=None):
output = []
if template is not None and '_msg' in template.keys():
output.append("#msg: {0}\n".format(template['_msg']))
rid = full_wl['rule'].get('id', "0")
output.append("#Rule ({0}) {1}\n".format(rid, self.core_msg.get(rid, 'Unknown ..')))
if self.cfg["output"]["verbosity"] >= 4:
output.append("#total hits {0}\n".format(full_wl['total_hits']))
for x in ["content", "peers", "uri", "var_name"]:
if x not in full_wl.keys():
continue
for y in full_wl[x]:
output.append("#{0} : {1}\n".format(x, unicode(y).encode("utf-8", 'replace')))
return ''.join(output)
# pprint.pprint(scores)
for x in scores['success']:
print "# success : "+self.grn.format(str(x['key'])+" is "+str(x['curr']))
for x in scores['warnings']:
print "# warnings : "+self.grn.format(str(x['key'])+" is "+str(x['curr']))
pass
def expand_tpl_path(self, template):
""" attempts to convert stuff to valid tpl paths.
if it starts with / or . it will consider it's a relative/absolute path,
else, that it's a regex on tpl names. """
clean_tpls = []
tpl_files = []
if template.startswith('/') or template.startswith('.'):
tpl_files.extend(glob.glob(template))
else:
for sdir in self.cfg['naxsi']['template_path']:
tpl_files.extend(glob.glob(sdir +"/"+template))
for x in tpl_files:
if x.endswith(".tpl") and x not in clean_tpls:
clean_tpls.append(x)
return clean_tpls
def load_tpl_file(self, tpl):
""" open, json.loads a tpl file,
cleanup data, return dict. """
try:
x = open(tpl)
except:
logging.error("Unable to open tpl file.")
return None
tpl_s = ""
for l in x.readlines():
if l.startswith('#'):
continue
else:
tpl_s += l
try:
template = json.loads(tpl_s)
except:
logging.error("Unable to load json from '"+tpl_s+"'")
return None
if '_success' in template.keys():
template['_success'] = self.normalize_checks(template['_success'])
if '_warnings' in template.keys():
template['_warnings'] = self.normalize_checks(template['_warnings'])
if '_deny' in template.keys():
template['_deny'] = self.normalize_checks(template['_deny'])
#return self.tpl_append_gfilter(template)
return template
def load_wl_file(self, wlf):
""" Loads a file of whitelists,
convert them to ES queries,
and returns them as a list """
esql = []
try:
wlfd = open(wlf, "r")
except:
logging.error("Unable to open whitelist file.")
return None
for wl in wlfd:
[res, esq] = self.wl2esq(wl)
if res is True:
esql.append(esq)
if len(esql) > 0:
return esql
return None
def load_cr_file(self, cr_file):
""" parses naxsi's core rule file, to
decorate output with "msg:" field content """
core_msg = {}
core_msg['0'] = "id:0 is wildcard (all rules) whitelist."
try:
fd = open(cr_file, 'r')
for i in fd:
if i.startswith('MainRule') or i.startswith('#@MainRule'):
pos = i.find('id:')
pos_msg = i.find('msg:')
self.core_msg[i[pos + 3:i[pos + 3].find(';') - 1]] = i[pos_msg + 4:][:i[pos_msg + 4:].find('"')]
fd.close()
except:
logging.warning("Unable to open rules file")
def tpl2esq(self, ob, full=True):
''' receives template or a rule, returns a valid
ElasticSearch query '''
qr = {
"query" : { "bool" : { "must" : [ ]} },
"size" : self.es_max_size
}
# A hack in case we were inadvertently given an esq
if 'query' in ob.keys():
return ob
for k in ob.keys():
if k.startswith("_"):
continue
# if key starts with '?' :
# use content for search, but use content from exceptions to generate WL
if k[0] == '?':
k = k[1:]
qr['query']['bool']['must'].append({"regexp" : { k : ob['?'+k] }})
# wildcard
elif ob[k] == '?':
pass
else:
qr['query']['bool']['must'].append({"match" : { k : ob[k]}})
qr = self.append_gfilter(qr)
return qr
def append_gfilter(self, esq):
""" append global filters parameters
to an existing elasticsearch query """
for x in self.cfg["global_filters"]:
if x.startswith('?'):
x = x[1:]
if {"regexp" : { x : self.cfg["global_filters"]['?'+x] }} not in esq['query']['bool']['must']:
esq['query']['bool']['must'].append({"regexp" : { x : self.cfg["global_filters"]['?'+x] }})
else:
if {"match" : { x : self.cfg["global_filters"][x] }} not in esq['query']['bool']['must']:
esq['query']['bool']['must'].append({"match" : { x : self.cfg["global_filters"][x] }})
return esq
def tpl_append_gfilter(self, tpl):
for x in self.cfg["global_filters"]:
tpl[x] = self.cfg["global_filters"][x]
return tpl
def wl2esq(self, raw_line):
""" parses a fulltext naxsi whitelist,
and outputs the matching es query (ie. for tagging),
returns [True|False, error_string|ESquery] """
esq = {
"query" : { "bool" : { "must" : [ ]} },
"size" : self.es_max_size
}
wl_id = ""
mz_str = ""
# do some pre-check to ensure it's a valid line
if raw_line.startswith("#"):
return [False, "commented out"]
if raw_line.find("BasicRule") == -1:
return [False, "not a BasicRule"]
# split line
strings = shlex.split(raw_line)
# bug #194 - drop everything after the first chunk starting with a '#' (inline comments)
for x in strings:
if x.startswith('#'):
strings = strings[:strings.index(x)]
# more checks
if len(strings) < 3:
return [False, "empty/incomplete line"]
if strings[0].startswith('#'):
return [False, "commented line"]
if strings[0] != "BasicRule":
return [False, "not a BasicRule, keyword '"+strings[0]+"'"]
if strings[len(strings) - 1].endswith(';'):
strings[len(strings) - 1] = strings[len(strings) - 1][:-1]
for x in strings:
if x.startswith("wl:"):
wl_id = x[3:]
# if ID contains "," replace them with OR for ES query
wl_id = wl_id.replace(",", " OR ")
# if ID != 0 add it, otherwise, it's a wildcard!
if wl_id != "0":
# if IDs are negative, we must exclude all IDs except
# those ones.
if wl_id.find("-") != -1:
wl_id = wl_id.replace("-", "")
#print "Negative query."
if not 'must_not' in esq['query']['bool'].keys():
esq['query']['bool']['must_not'] = []
esq['query']['bool']['must_not'].append({"match" : { "id" : wl_id}})
else:
esq['query']['bool']['must'].append({"match" : { "id" : wl_id}})
if x.startswith("mz:"):
mz_str = x[3:]
[res, filters] = self.parse_mz(mz_str, esq)
if res is False:
return [False, "matchzone parsing failed."]
esq = self.append_gfilter(esq)
return [True, filters]
def parse_mz(self, mz_str, esq):
""" parses a match zone from BasicRule, and updates
es query accordingly. Removes ^/$ chars from regexp """
forbidden_rx_chars = "^$"
kw = mz_str.split("|")
tpl = esq['query']['bool']['must']
uri = ""
zone = ""
var_name = ""
t_name = False
# |NAME flag
if "NAME" in kw:
t_name = True
kw.remove("NAME")
for k in kw:
# named var
if k.startswith('$'):
k = k[1:]
try:
[zone, var_name] = k.split(':')
except:
return [False, "Incoherent zone : "+k]
# *_VAR:<string>
if zone.endswith("_VAR"):
zone = zone[:-4]
if t_name is True:
zone += "|NAME"
tpl.append({"match" : { "zone" : zone}})
tpl.append({"match" : { "var_name" : var_name}})
# *_VAR_X:<regexp>
elif zone.endswith("_VAR_X"):
zone = zone[:-6]
if t_name is True:
zone += "|NAME"
tpl.append({"match" : { "zone" : zone}})
#.translate(string.maketrans(chars, newchars))
tpl.append({"regexp" : { "var_name" : var_name.translate(None, forbidden_rx_chars)}})
# URL_X:<regexp>
elif zone == "URL_X":
zone = zone[:-2]
tpl.append({"regexp" : { "uri" : var_name.translate(None, forbidden_rx_chars)}})
# URL:<string>
elif zone == "URL":
tpl.append({"match" : { "uri" : var_name }})
else:
print "huh, what's that ? "+zone
# |<ZONE>
else:
if k not in ["HEADERS", "BODY", "URL", "ARGS", "FILE_EXT"]:
return [False, "Unknown zone : '"+k+"'"]
zone = k
if t_name is True:
zone += "|NAME"
tpl.append({"match" : {"zone" : zone}})
# print "RULE :"
# pprint.pprint(esq)
return [True, esq]
def tpl2wl(self, rule, template=None):
""" transforms a rule/esq
to a valid BasicRule. """
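        # Illustrative example: {'id': '1310', 'zone': 'ARGS', 'var_name': 'foo', 'uri': '/index.php'}
        # would yield something like: BasicRule  wl:1310 "mz:$URL:/index.php|$ARGS_VAR:foo";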
tname = False
zone = ""
if template is not None and '_statics' in template.keys():
for x in template['_statics'].keys():
rule[x] = template['_statics'][x]
wl = "BasicRule "
wl += " wl:"+str(rule.get('id', 0)).replace("OR", ",").replace("|", ",").replace(" ", "")
wl += ' "mz:'
if rule.get('uri', None) is not None:
wl += "$URL:"+rule['uri']
wl += "|"
# whitelist targets name
if rule.get('zone', '').endswith("|NAME"):
tname = True
zone = rule['zone'][:-5]
else:
zone = rule['zone']
if rule.get('var_name', '') not in ['', '?'] and zone != "FILE_EXT":
wl += "$"+zone+"_VAR:"+rule['var_name']
else:
wl += zone
if tname is True:
wl += "|NAME"
wl += '";'
return wl
def fetch_top(self, template, field, limit=10):
""" fetch top items for a given field,
        clears the field if it exists in gfilters """
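        # Returned entries are preformatted strings, e.g. something like:
        #   "www.example.com 42.0% (total: 420/1000)"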
x = None
if field in template.keys():
x = template[field]
del template[field]
esq = self.tpl2esq(template)
if x is not None:
template[field] = x
if self.cfg["elastic"].get("version", None) == "1":
esq['facets'] = { "facet_results" : {"terms": { "field": field, "size" : self.es_max_size} }}
elif self.cfg["elastic"].get("version", None) in ["2", "5"]:
esq['aggregations'] = { "agg1" : {"terms": { "field": field, "size" : self.es_max_size} }}
else:
print "Unknown / Unspecified ES version in nxapi.json : {0}".format(self.cfg["elastic"].get("version", "#UNDEFINED"))
sys.exit(1)
res = self.search(esq)
if self.cfg["elastic"].get("version", None) == "1":
total = res['facets']['facet_results']['total']
elif self.cfg["elastic"].get("version", None) in ["2", "5"]:
total = res['hits']['total']
else:
print "Unknown / Unspecified ES version in nxapi.json : {0}".format(self.cfg["elastic"].get("version", "#UNDEFINED"))
sys.exit(1)
count = 0
ret = []
if self.cfg["elastic"].get("version", None) == "1":
for x in res['facets']['facet_results']['terms']:
ret.append('{0} {1}% (total: {2}/{3})'.format(x['term'], round((float(x['count']) / total) * 100, 2), x['count'], total))
count += 1
if count > limit:
break
elif self.cfg["elastic"].get("version", None) in ["2", "5"]:
for x in res['aggregations']['agg1']['buckets']:
ret.append('{0} {1}% (total: {2}/{3})'.format(x['key'], round((float(x['doc_count']) / total) * 100, 2), x['doc_count'], total))
count += 1
if count > limit:
break
else:
print "Unknown / Unspecified ES version in nxapi.json : {0}".format(self.cfg["elastic"].get("version", "#UNDEFINED"))
sys.exit(1)
return ret
def fetch_uniques(self, rule, key):
""" shortcut function to gather unique
values and their associated match count """
uniques = []
esq = self.tpl2esq(rule)
#
if self.cfg["elastic"].get("version", None) == "1":
esq['facets'] = { "facet_results" : {"terms": { "field": key, "size" : 50000} }}
elif self.cfg["elastic"].get("version", None) in ["2", "5"]:
esq['aggregations'] = { "agg1" : {"terms": { "field": key, "size" : 50000} }}
else:
print "Unknown / Unspecified ES version in nxapi.json : {0}".format(self.cfg["elastic"].get("version", "#UNDEFINED"))
sys.exit(1)
res = self.search(esq)
if self.cfg["elastic"].get("version", None) == "1":
for x in res['facets']['facet_results']['terms']:
if x['term'] not in uniques:
uniques.append(x['term'])
elif self.cfg["elastic"].get("version", None) in ["2", "5"]:
for x in res['aggregations']['agg1']['buckets']:
if x['key'] not in uniques:
uniques.append(x['key'])
else:
print "Unknown / Unspecified ES version in nxapi.json : {0}".format(self.cfg["elastic"].get("version", "#UNDEFINED"))
sys.exit(1)
return { 'list' : uniques, 'total' : len(uniques) }
def index(self, body, eid):
return self.es.index(index=self.cfg["elastic"]["index"], doc_type=self.cfg["elastic"]["doctype"], body=body, id=eid)
def search(self, esq, stats=False):
""" search wrapper with debug """
debug = False
if debug is True:
print "#SEARCH:PARAMS:index="+self.cfg["elastic"]["index"]+", doc_type="+self.cfg["elastic"]["doctype"]+", body=",
print "#SEARCH:QUERY:",
pprint.pprint (esq)
if len(esq["query"]["bool"]["must"]) == 0:
del esq["query"]
x = self.es.search(index=self.cfg["elastic"]["index"], doc_type=self.cfg["elastic"]["doctype"], body=esq)
if debug is True:
print "#RESULT:",
pprint.pprint(x)
return x
def normalize_checks(self, tpl):
""" replace check signs (<, >, <=, >=) by
operator.X in a dict-form tpl """
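        # e.g. {"rule_ip": [">=", 10]} becomes {"rule_ip": [operator.ge, 10]}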
replace = {
'>' : operator.gt,
'<' : operator.lt,
'>=' : operator.ge,
'<=' : operator.le
}
for tpl_key in tpl.keys():
for token in replace.keys():
if tpl[tpl_key][0] == token:
tpl[tpl_key][0] = replace[token]
return tpl
def tag_events(self, esq, msg, tag=False):
""" tag events with msg + tstamp if they match esq """
count = 0
total_events = 0
esq["size"] = "0"
print "TAG RULE :",
pprint.pprint(esq)
x = self.search(esq)
total_events = int(str(x["hits"]["total"]))
print str(self.grn.format(total_events)) + " items to be tagged ..."
size = int(x['hits']['total'])
if size > 20000:
size = size / 100
elif size > 100:
size = size / 10
while count < total_events:
esq["size"] = size
esq["from"] = 0
res = self.search(esq)
# Iterate through matched evts to tag them.
if int(res['hits']['total']) == 0:
break
for item in res['hits']['hits']:
eid = item['_id']
body = item['_source']
cm = item['_source']['comments']
body['comments'] += ","+msg+":"+str(datetime.datetime.now())
body['whitelisted'] = "true"
if tag is True:
self.index(body, eid)
else:
print eid+",",
count += 1
print "Tagged {0} events out of {1}".format(count, total_events)
if total_events - count < size:
size = total_events - count
print ""
#--
        if not tag:
            return 0
        return count
def gen_wl(self, tpl, rule={}):
""" recursive whitelist generation function,
        returns a list of all possible whitelists. """
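        # e.g. for a template like {"zone": "ARGS", "id": "?"} every '?' field is expanded
        # against the unique values found in ES, producing one candidate rule per combination.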
retlist = []
for tpl_key in tpl.keys():
if tpl_key in rule.keys():
continue
if tpl_key[0] in ['_', '?']:
continue
if tpl[tpl_key] == '?':
continue
rule[tpl_key] = tpl[tpl_key]
for tpl_key in tpl.keys():
if tpl_key.startswith('_'):
continue
elif tpl_key.startswith('?'):
if tpl_key[1:] in rule.keys():
continue
unique_vals = self.fetch_uniques(rule, tpl_key[1:])['list']
for uval in unique_vals:
rule[tpl_key[1:]] = uval
retlist += self.gen_wl(tpl, copy.copy(rule))
return retlist
elif tpl[tpl_key] == '?':
if tpl_key in rule.keys():
continue
unique_vals = self.fetch_uniques(rule, tpl_key)['list']
for uval in unique_vals:
rule[tpl_key] = uval
retlist += self.gen_wl(tpl, copy.copy(rule))
return retlist
elif tpl_key not in rule.keys():
rule[tpl_key] = tpl[tpl_key]
retlist += self.gen_wl(tpl, copy.copy(rule))
return retlist
esq = self.tpl2esq(rule)
res = self.search(esq)
if res['hits']['total'] > 0:
clist = []
peers = []
uri = []
var_name = []
for x in res['hits']['hits']:
if len(x.get("_source").get("ip", "")) > 0 and x.get("_source").get("ip", "") not in peers:
peers.append(x["_source"]["ip"])
if len(x.get("_source").get("uri", "")) > 0 and x.get("_source").get("uri", "") not in uri:
uri.append(x["_source"]["uri"])
if len(x.get("_source").get("var_name", "")) > 0 and x.get("_source").get("var_name", "") not in var_name:
var_name.append(x["_source"]["var_name"])
if len(x.get("_source").get("content", "")) > 0 and x.get("_source").get("content", "") not in clist:
clist.append(x["_source"]["content"])
if len(clist) >= 5:
break
retlist.append({'rule' : rule, 'content' : clist[:5], 'total_hits' : res['hits']['total'], 'peers' : peers[:5], 'uri' : uri[:5],
'var_name' : var_name[:5]})
return retlist
return []

View File

@@ -0,0 +1,101 @@
'''
This module generates types for URL parameters.
'''
import re
import sys
import collections
from elasticsearch import Elasticsearch
# Each regexp is a subset of the next one
REGEXPS = [
[r'^$', 'empty'],
[r'^[01]$', 'boolean'],
[r'^\d+$', 'integer'],
[r'^#[0-9a-f]+$', 'colour'], # hex + '#'
[r'^[0-9a-f]+$', 'hexadecimal'],
[r'^[0-9a-z]+$', 'alphanum'],
[r'^https?://([0-9a-z-.]+\.)+[\w?+-=&/ ]+$', 'url'], # like http://pouet.net?hello=1&id=3
[r'^\w+$', 'alphanumdash'],
[r'^[0-9a-z?&=+_-]+$', 'url parameter'],
    [r'^[\w[\] ,&=+-]+$', 'array'],
    [r'^[' + r'\s\w' + r'!$%^&*()[\]:;@~#?/.,' + r']+$', 'plaintext'],
    [r'', 'none'], # untypable parameters
]
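# e.g. the value "42" fails the 'empty' and 'boolean' regexps but matches 'integer',
# so get_rules() below keeps bumping the index until it settles on 'integer' for that parameter.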
class Typificator(object):
    ''' Class that:
        1. Fetches data from ES
        2. Generates types for parameters
        3. Yields the resulting type rules
'''
def __init__(self, es, cfg):
self.es_instance = es
self.cfg = cfg
def __get_data(self, nb_samples=1e5):
''' Get (in a lazy way) data from the ES instance
'''
data = set()
position = 0
        size = min(10000, nb_samples) # if nb_samples is smaller than our size, we'll get it in a single request.
while nb_samples:
if not data:
body = {'query': {}}
for k,v in self.cfg['global_filters'].iteritems():
body['query'].update({'match':{k:v}})
data = self.es_instance.search(index=self.cfg["elastic"]["index"], doc_type='events',
size=size, from_=position,
body=body)
data = data['hits']['hits'] # we don't care about metadata
if not data: # we got all data from ES
return
position += size
nb_samples -= size
for log in data:
yield log['_source']
def get_rules(self, nb_samples=1e5):
''' Generate (in a lazy way) types for parameters
'''
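        # Each yielded rule is a list like [regexp, type_name, zone, var_name],
        # e.g. ['^\d+$', 'integer', 'ARGS', 'id'] (illustrative values).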
# Thank you defaultdict <3
# rules = {zone1: {var1:0, var2:0}, zone2: {var6:0, ...}, ...}
rules = collections.defaultdict(lambda: collections.defaultdict(int))
# Compile regexp for speed
regexps = [re.compile(reg, re.IGNORECASE) for reg, _ in REGEXPS]
for line in self.__get_data(nb_samples):
try: # some events are fucked up^w^w empty
#naxsi inverts the var_name and the content
                #when a rule matches on var_name
if line['zone'].endswith('|NAME'):
continue
zone = line['zone']
content = line['content']
var_name = line['var_name']
except KeyError as e:
print 'Error with : {0} ({1})'.format(line, e)
continue
if not var_name: # No types for empty varnames.
continue
# Bump regexps until one matches
# Since every regexp is a subset of the next one,
# this works great.
while not regexps[rules[zone][var_name]].match(content):
rules[zone][var_name] += 1
for zone, zone_data in rules.iteritems():
for var_name, index in zone_data.iteritems():
if index < len(REGEXPS) - 1: # Don't return untyped things
yield [REGEXPS[index][0], REGEXPS[index][1], zone, var_name]
if __name__ == '__main__':
    nb_samples = 1e6 if len(sys.argv) == 1 else int(sys.argv[1])
    # Typificator needs an ES client and a config; assuming a local instance and the default nxapi settings here.
    cfg = {'elastic': {'index': 'nxapi'}, 'global_filters': {'whitelisted': 'false'}}
    for rule in Typificator(Elasticsearch(), cfg).get_rules(nb_samples):
        print 'TypeRule "rx:{0}" "msg:typed ({1}) parameter" "mz:${2}_VAR:{3}"'.format(rule[0], rule[1], rule[2], rule[3])

479
naxsi-0.55.3/nxapi/nxtool.py Executable file
View File

@@ -0,0 +1,479 @@
#!/usr/bin/env python
import glob, fcntl, termios
import sys
import socket
import elasticsearch
import time
import os
import tempfile
import subprocess
import json
import logging
from collections import defaultdict
from optparse import OptionParser, OptionGroup
from nxapi.nxtransform import *
from nxapi.nxparse import *
F_SETPIPE_SZ = 1031 # Linux 2.6.35+
F_GETPIPE_SZ = 1032 # Linux 2.6.35+
def open_fifo(fifo):
try:
os.mkfifo(fifo)
except OSError:
print "Fifo ["+fifo+"] already exists (non fatal)."
except Exception, e:
print "Unable to create fifo ["+fifo+"]"
try:
print "Opening fifo ... will return when data is available."
fifo_fd = open(fifo, 'r')
fcntl.fcntl(fifo_fd, F_SETPIPE_SZ, 1000000)
print "Pipe (modified) size : "+str(fcntl.fcntl(fifo_fd, F_GETPIPE_SZ))
except Exception, e:
print "Unable to create fifo, error: "+str(e)
return None
return fifo_fd
def macquire(line):
z = parser.parse_raw_line(line)
# add data str and country
if z is not None:
for event in z['events']:
event['date'] = z['date']
try:
event['coords'] = geoloc.ip2ll(event['ip'])
event['country'] = geoloc.ip2cc(event['ip'])
except NameError:
pass
# print "Got data :)"
# pprint.pprint(z)
#print ".",
print z
injector.insert(z)
else:
pass
#print "No data ? "+line
#print ""
opt = OptionParser()
# group : config
p = OptionGroup(opt, "Configuration options")
p.add_option('-c', '--config', dest="cfg_path", default="/usr/local/etc/nxapi.json", help="Path to nxapi.json (config).")
p.add_option('--colors', dest="colors", action="store_false", default="true", help="Disable output colors.")
# p.add_option('-q', '--quiet', dest="quiet_flag", action="store_true", help="Be quiet.")
# p.add_option('-v', '--verbose', dest="verb_flag", action="store_true", help="Be verbose.")
opt.add_option_group(p)
# group : in option
p = OptionGroup(opt, "Input options (log acquisition)")
p.add_option('--files', dest="files_in", help="Path to log files to parse.")
p.add_option('--fifo', dest="fifo_in", help="Path to a FIFO to be created & read from. [infinite]")
p.add_option('--stdin', dest="stdin", action="store_true", help="Read from stdin.")
p.add_option('--no-timeout', dest="infinite_flag", action="store_true", help="Disable timeout on read operations (stdin/fifo).")
p.add_option('--syslog', dest="syslog_in", action="store_true", help="Listen on tcp port for syslog logging.")
opt.add_option_group(p)
# group : filtering
p = OptionGroup(opt, "Filtering options (for whitelist generation)")
p.add_option('-s', '--server', dest="server", help="FQDN to which we should restrict operations.")
p.add_option('--filter', dest="filter", action="append", help="Specify an extra filter; filters are merged with existing templates/filters. (--filter 'uri /foobar')")
opt.add_option_group(p)
# group : tagging
p = OptionGroup(opt, "Tagging options (tag existing events in database)")
p.add_option('-w', '--whitelist-path', dest="wl_file", help="A path to whitelist file, will find matching events in DB.")
p.add_option('-i', '--ip-path', dest="ips", help="A path to IP list file, will find matching events in DB.")
p.add_option('--tag', dest="tag", action="store_true", help="Actually tag matching items in DB.")
opt.add_option_group(p)
# group : whitelist generation
p = OptionGroup(opt, "Whitelist Generation")
p.add_option('-f', '--full-auto', dest="full_auto", action="store_true", help="Attempt fully automatic whitelist generation process.")
p.add_option('-t', '--template', dest="template", help="Path to template to apply.")
p.add_option('--slack', dest="slack", action="store_false", help="Enables less strict mode.")
p.add_option('--type', dest="type_wl", action="store_true", help="Generate whitelists based on param type")
opt.add_option_group(p)
# group : statistics
p = OptionGroup(opt, "Statistics Generation")
p.add_option('-x', '--stats', dest="stats", action="store_true", help="Generate statistics about the current DB content.")
opt.add_option_group(p)
# group : interactive generation
p = OptionGroup(opt, "Interactive Whitelists Generation")
p.add_option('-g', '--interactive-generation', dest="int_gen", action="store_true", help="Use your favorite text editor for whitelist generation.")
opt.add_option_group(p)
(options, args) = opt.parse_args()
try:
cfg = NxConfig(options.cfg_path)
except ValueError:
sys.exit(-1)
if options.server is not None:
cfg.cfg["global_filters"]["server"] = options.server
# https://github.com/nbs-system/naxsi/issues/231
mutually_exclusive = ['stats', 'full_auto', 'template', 'wl_file', 'ips', 'files_in', 'fifo_in', 'syslog_in']
count = 0
for x in mutually_exclusive:
    if options.ensure_value(x, None) is not None:
        count += 1
if count > 1:
    print "Mutually exclusive options are present (i.e. import and stats), aborting."
sys.exit(-1)
cfg.cfg["output"]["colors"] = "false" if options.int_gen else str(options.colors).lower()
cfg.cfg["naxsi"]["strict"] = str(options.slack).lower()
def get_filter(arg_filter):
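    # Turns the repeated --filter arguments into a dict merged into global_filters,
    # e.g. --filter 'uri /foobar' --filter '?server dev' -> {'uri': '/foobar', '?server': 'dev'}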
x = {}
to_parse = []
kwlist = ['server', 'uri', 'zone', 'var_name', 'ip', 'id', 'content', 'country', 'date',
'?server', '?uri', '?var_name', '?content']
try:
for argstr in arg_filter:
argstr = ' '.join(argstr.split())
to_parse += argstr.split(' ')
if [a for a in kwlist if a in to_parse]:
for kw in to_parse:
if kw in kwlist:
x[kw] = to_parse[to_parse.index(kw)+1]
else:
raise
except:
        logging.critical("option --filter requires a known keyword and a value (e.g. --filter 'uri /foobar')")
sys.exit(-1)
return x
if options.filter is not None:
cfg.cfg["global_filters"].update(get_filter(options.filter))
try:
use_ssl = bool(cfg.cfg["elastic"]["use_ssl"])
except KeyError:
use_ssl = False
es = elasticsearch.Elasticsearch(cfg.cfg["elastic"]["host"], use_ssl=use_ssl)
# Get ES version from the client and avail it at cfg
es_version = es.info()['version'].get('number', None)
if es_version is not None:
cfg.cfg["elastic"]["version"] = es_version.split(".")[0]
if cfg.cfg["elastic"].get("version", None) is None:
print "Failed to get version from ES, Specify version ['1'/'2'/'5'] in [elasticsearch] section"
sys.exit(-1)
translate = NxTranslate(es, cfg)
if options.type_wl is True:
translate.wl_on_type()
sys.exit(0)
# whitelist generation options
if options.full_auto is True:
translate.load_cr_file(translate.cfg["naxsi"]["rules_path"])
results = translate.full_auto()
if results:
for result in results:
print "{0}".format(result)
else:
print "No hits for this filter."
sys.exit(1)
sys.exit(0)
if options.template is not None:
scoring = NxRating(cfg.cfg, es, translate)
tpls = translate.expand_tpl_path(options.template)
gstats = {}
if len(tpls) <= 0:
print "No template matching"
sys.exit(1)
# prepare statistics for global scope
scoring.refresh_scope('global', translate.tpl2esq(cfg.cfg["global_filters"]))
for tpl_f in tpls:
scoring.refresh_scope('rule', {})
scoring.refresh_scope('template', {})
print translate.grn.format("#Loading tpl '"+tpl_f+"'")
tpl = translate.load_tpl_file(tpl_f)
# prepare statistics for filter scope
scoring.refresh_scope('template', translate.tpl2esq(tpl))
#pprint.pprint(tpl)
print "Hits of template : "+str(scoring.get('template', 'total'))
whitelists = translate.gen_wl(tpl, rule={})
print str(len(whitelists))+" whitelists ..."
for genrule in whitelists:
#pprint.pprint(genrule)
scoring.refresh_scope('rule', genrule['rule'])
scores = scoring.check_rule_score(tpl)
if (len(scores['success']) > len(scores['warnings']) and scores['deny'] == False) or cfg.cfg["naxsi"]["strict"] == "false":
#print "?deny "+str(scores['deny'])
print translate.fancy_display(genrule, scores, tpl)
print translate.grn.format(translate.tpl2wl(genrule['rule'], tpl)).encode('utf-8')
sys.exit(0)
# tagging options
if options.wl_file is not None and options.server is None:
print translate.red.format("Cannot tag events in database without a server name !")
sys.exit(2)
if options.wl_file is not None:
wl_files = []
wl_files.extend(glob.glob(options.wl_file))
count = 0
for wlf in wl_files:
print translate.grn.format("#Loading tpl '"+wlf+"'")
try:
wlfd = open(wlf, "r")
except:
print translate.red.format("Unable to open wl file '"+wlf+"'")
sys.exit(-1)
for wl in wlfd:
[res, esq] = translate.wl2esq(wl)
if res is True:
count = 0
while True:
x = translate.tag_events(esq, "Whitelisted", tag=options.tag)
count += x
if x == 0:
break
print translate.grn.format(str(count)) + " items tagged ..."
count = 0
sys.exit(0)
if options.ips is not None:
ip_files = []
ip_files.extend(glob.glob(options.ips))
tpl = {}
count = 0
# esq = translate.tpl2esq(cfg.cfg["global_filters"])
for wlf in ip_files:
try:
wlfd = open(wlf, "r")
except:
print "Unable to open ip file '"+wlf+"'"
sys.exit(-1)
for wl in wlfd:
print "=>"+wl
tpl["ip"] = wl.strip('\n')
esq = translate.tpl2esq(tpl)
pprint.pprint(esq)
pprint.pprint(tpl)
count += translate.tag_events(esq, "BadIPS", tag=options.tag)
print translate.grn.format(str(count)) + " items to be tagged ..."
count = 0
sys.exit(0)
# statistics
if options.stats is True:
print translate.red.format("# Whitelist(ing) ratio :")
translate.fetch_top(cfg.cfg["global_filters"], "whitelisted", limit=2)
print translate.red.format("# Top servers :")
for e in translate.fetch_top(cfg.cfg["global_filters"], "server", limit=10):
try:
list_e = e.split()
print '# {0} {1} {2}{3}'.format(translate.grn.format(list_e[0]), list_e[1], list_e[2], list_e[3])
except:
print "--malformed--"
print translate.red.format("# Top URI(s) :")
for e in translate.fetch_top(cfg.cfg["global_filters"], "uri", limit=10):
try:
list_e = e.split()
print '# {0} {1} {2}{3}'.format(translate.grn.format(list_e[0]), list_e[1], list_e[2], list_e[3])
except:
print "--malformed--"
print translate.red.format("# Top Zone(s) :")
for e in translate.fetch_top(cfg.cfg["global_filters"], "zone", limit=10):
try:
list_e = e.split()
print '# {0} {1} {2}{3}'.format(translate.grn.format(list_e[0]), list_e[1], list_e[2], list_e[3])
except:
print "--malformed--"
print translate.red.format("# Top Peer(s) :")
for e in translate.fetch_top(cfg.cfg["global_filters"], "ip", limit=10):
try:
list_e = e.split()
print '# {0} {1} {2}{3}'.format(translate.grn.format(list_e[0]), list_e[1], list_e[2], list_e[3])
except:
print "--malformed--"
sys.exit(0)
def write_generated_wl(filename, results):
with open('/tmp/{0}'.format(filename), 'w') as wl_file:
for result in results:
for key, items in result.iteritems():
if items:
print "{} {}".format(key, items)
if key == 'genrule':
wl_file.write("# {}\n{}\n".format(key, items))
else:
wl_file.write("# {} {}\n".format(key, items))
wl_file.flush()
def ask_user_for_server_selection(editor, welcome_sentences, selection):
with tempfile.NamedTemporaryFile(suffix='.tmp') as temporary_file:
top_selection = translate.fetch_top(cfg.cfg["global_filters"],
selection,
limit=10
)
temporary_file.write(welcome_sentences)
for line in top_selection:
temporary_file.write('{0}\n'.format(line))
temporary_file.flush()
subprocess.call([editor, temporary_file.name])
temporary_file.seek(len(welcome_sentences))
ret = []
for line in temporary_file:
if not line.startswith('#'):
ret.append(line.strip().split()[0])
return ret
def ask_user_for_selection(editor, welcome_sentences, selection, servers):
regex_message = "# as in the --filter option you can add ? for regex\n"
ret = {}
for server in servers:
server_reminder = "server: {0}\n\n".format(server)
ret[server] = []
with tempfile.NamedTemporaryFile(suffix='.tmp') as temporary_file:
temporary_file.write(welcome_sentences + regex_message + server_reminder)
cfg.cfg["global_filters"]["server"] = server
top_selection = translate.fetch_top(cfg.cfg["global_filters"],
selection,
limit=10
)
for line in top_selection:
temporary_file.write('{0} {1}\n'.format(selection, line))
temporary_file.flush()
subprocess.call([editor, temporary_file.name])
temporary_file.seek(len(welcome_sentences) + len(server_reminder) + len(regex_message))
for line in temporary_file:
if not line.startswith('#'):
res = line.strip().split()
ret[server].append((res[0], res[1]))
return ret
def generate_wl(selection_dict):
for key, items in selection_dict.iteritems():
if not items:
return False
global_filters_context = cfg.cfg["global_filters"]
global_filters_context["server"] = key
for idx, (selection, item) in enumerate(items):
global_filters_context[selection] = item
translate.cfg["global_filters"] = global_filters_context
print 'generating wl with filters {0}'.format(global_filters_context)
wl_dict_list = []
res = translate.full_auto(wl_dict_list)
del global_filters_context[selection]
write_generated_wl(
"server_{0}_{1}.wl".format(
key,
idx if (selection == "uri") else "zone_{0}".format(item),
),
wl_dict_list
)
if options.int_gen is True:
editor = os.environ.get('EDITOR', 'vi')
welcome_sentences = '{0}\n{1}\n'.format(
        '# any deleted line or line starting with a # will be ignored',
        '# if you want to use the slack option you have to specify it on the command line'
)
servers = ask_user_for_server_selection(editor, welcome_sentences, "server")
uris = ask_user_for_selection(editor, welcome_sentences, "uri", servers)
zones = ask_user_for_selection(editor, welcome_sentences, "zone", servers)
if uris:
generate_wl(uris)
if zones:
generate_wl(zones)
    # in case the user left the uri and zone files empty, generate wl for all
    # selected server(s)
if not uris and not zones:
for server in servers:
translate.cfg["global_filters"]["server"] = server
print 'generating with filters: {0}'.format(translate.cfg["global_filters"])
res = translate.full_auto()
            write_generated_wl("server_{0}.wl".format(server), res)
sys.exit(0)
# input options, only setup injector if one input option is present
if options.files_in is not None or options.fifo_in is not None or options.stdin is not None or options.syslog_in is not None:
if options.fifo_in is not None or options.syslog_in is not None:
injector = ESInject(es, cfg.cfg, auto_commit_limit=1)
else:
injector = ESInject(es, cfg.cfg)
parser = NxParser()
offset = time.timezone if (time.localtime().tm_isdst == 0) else time.altzone
offset = offset / 60 / 60 * -1
if offset < 0:
offset = str(-offset)
else:
offset = str(offset)
offset = offset.zfill(2)
parser.out_date_format = "%Y-%m-%dT%H:%M:%S+"+offset #ES-friendly
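    # e.g. on a UTC+1 host without DST the offset becomes "01", so events are
    # indexed with dates like 2017-01-01T12:00:00+01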
try:
geoloc = NxGeoLoc(cfg.cfg)
except:
print "Unable to get GeoIP"
if options.files_in is not None:
reader = NxReader(macquire, lglob=[options.files_in])
reader.read_files()
injector.stop()
sys.exit(0)
if options.fifo_in is not None:
fd = open_fifo(options.fifo_in)
if options.infinite_flag is True:
reader = NxReader(macquire, fd=fd, stdin_timeout=None)
else:
reader = NxReader(macquire, fd=fd)
while True:
print "start-",
if reader.read_files() == False:
break
print "stop"
print 'End of fifo input...'
injector.stop()
sys.exit(0)
if options.syslog_in is not None:
sysloghost = cfg.cfg["syslogd"]["host"]
syslogport = cfg.cfg["syslogd"]["port"]
while 1:
reader = NxReader(macquire, syslog=True, syslogport=syslogport, sysloghost=sysloghost)
reader.read_files()
injector.stop()
sys.exit(0)
if options.stdin is True:
if options.infinite_flag:
reader = NxReader(macquire, lglob=[], fd=sys.stdin, stdin_timeout=None)
else:
reader = NxReader(macquire, lglob=[], fd=sys.stdin)
while True:
print "start-",
if reader.read_files() == False:
break
print "stop"
print 'End of stdin input...'
injector.stop()
sys.exit(0)
opt.print_help()
sys.exit(0)

View File

@@ -0,0 +1 @@
elasticsearch

View File

@@ -0,0 +1,36 @@
#!/usr/bin/env python
from distutils.core import setup
import os
import glob
import pprint
f = {}
data_files = [('/usr/local/nxapi/', ['nx_datas/country2coords.txt']),
('/usr/local/etc/', ['nxapi.json'])]
#modules = []
for dirname, dirnames, filenames in os.walk('tpl/'):
for filename in filenames:
if filename.endswith(".tpl"):
print dirname+"#"+filename
if "/usr/local/nxapi/"+dirname not in f.keys():
f["/usr/local/nxapi/"+dirname] = []
f["/usr/local/nxapi/"+dirname].append(os.path.join(dirname, filename))
for z in f.keys():
data_files.append( (z, f[z]))
setup(name='nxtool',
version='1.0',
description='Naxsi log parser, whitelist & report generator',
author='Naxsi Dev Team',
author_email='thibault.koechlin@nbs-system.com',
url='http://github.com/nbs-system/naxsi',
scripts=['nxtool.py'],
packages=['nxapi'],
data_files=data_files
)

View File

@@ -0,0 +1,11 @@
{
"var_name" : "__utmz",
"id" : "1009 or 1010 or 1005 or 1011",
"zone" : "ARGS",
"_statics" : {
"id" : "1009,1010,1005,1011"
},
"_msg" : "google analytics, __utmz var in ARGS"
}

View File

@@ -0,0 +1,8 @@
{
"_msg" : "A generic, precise wl tpl (url+var+id)",
"zone" : "ARGS",
"var_name" : "?",
"id" : "?",
"uri" : "?",
"_warnings" : { "template_uri" : [ ">=", "5"]}
}

View File

@@ -0,0 +1,11 @@
{
"_msg" : "A generic, wide (id+zone) wl",
"_success" : { "template_uri" : [ ">", "5"],
"rule_uri" : [ ">", "5"]},
"_warnings" : { "rule_var_name" : [ "<=", "5" ],
"rule_uri" : [ "<=", "5" ] },
"_deny" : { "rule_var_name" : [ "<", "10" ] },
"zone" : "ARGS",
"id" : "?"
}

View File

@@ -0,0 +1,8 @@
{
"_msg" : "A generic whitelist, true for the whole uri",
"zone" : "ARGS|NAME",
"uri" : "?",
"id" : "?",
"_warnings" : { "template_uri" : [ ">", "5" ] },
"_success" : { "rule_var_name" : [ ">", "5" ] }
}

View File

@@ -0,0 +1,8 @@
{
"_msg" : "A generic whitelist, true for the whole uri",
"zone" : "ARGS",
"uri" : "?",
"id" : "?",
"_deny" : { "rule_var_name" : [ "<=", "3" ]},
"_success" : { "rule_var_name" : [ ">", "3" ]}
}

View File

@@ -0,0 +1,9 @@
{
"_msg" : "A generic, precise wl tpl (url+var+id)",
"zone" : "BODY",
"var_name" : "?",
"id" : "?",
"uri" : "?",
"_warnings" : { "template_uri" : [ "<", "5"],
"template_var_name" : [ "<", "5"]}
}

View File

@@ -0,0 +1,8 @@
{
"_msg" : "A generic, wide (id+zone) wl",
"zone" : "BODY",
"id" : "?",
"_success" : { "template_uri" : [ ">", "5"],
"template_var_name" : [ ">", "5"]},
"_deny" : { "rule_var_name" : [ "<", "10" ] }
}

View File

@@ -0,0 +1,7 @@
{
"_msg" : "A generic whitelist, true for the whole uri, BODY|NAME",
"zone" : "BODY|NAME",
"uri" : "?",
"id" : "?",
"_warnings" : { "template_uri" : [ ">", "5"] }
}

View File

@@ -0,0 +1,8 @@
{
"_warnings" : { "template_uri" : [ ">", "5"]},
"_deny" : {"rule_var_name" : ["<", "5"]},
"_msg" : "A generic whitelist, true for the whole uri",
"zone" : "BODY",
"uri" : "?",
"id" : "?"
}

View File

@@ -0,0 +1,8 @@
{
"zone" : "BODY",
"var_name" : "?",
"id" : "?",
"_msg" : "A generic rule to spot var-name specific WL",
"_success" : { "rule_uri" : [ ">", "2"]},
"_deny" : { "rule_uri" : ["<", "2"]}
}

View File

@@ -0,0 +1,5 @@
{
"_success" : { "template_uri" : [ ">=", "5"] },
"zone" : "HEADERS",
"var_name" : "cookie",
"id" : "?"}

View File

@@ -0,0 +1,6 @@
{
"id" : "1002",
"zone" : "URL",
"_success" : { "template_uri" : [ ">", "5"],
"rule_uri" : [ ">", "5"]}
}

View File

@@ -0,0 +1,7 @@
{
"_success" : { "template_uri" : [ ">=", "5"],
"rule_uri" : [ ">=", "5"]},
"zone" : "URL",
"id" : "?"
}

View File

@@ -0,0 +1,6 @@
{
"_deny" : { "template_uri" : [ ">", "5" ] },
"uri" : "?",
"zone" : "URL",
"id" : "?"
}