The Domain Name System (DNS) is indispensable to a large number of network applications. Classifying DNS infrastructure into hierarchical roles is highly desirable for purposes such as network management and threat evaluation. However, almost all traditional measurements depend on active scanning and do not consider the dynamic packet-level features of different DNS infrastructures.
In this paper, we propose IDNS (Identifying DNS), a high-performance model based on passive measurement. IDNS (i) extracts single-packet field features (SFF) and multi-packet statistical features (MSF) from DNS traffic, (ii) uses an estimation algorithm to compute MSF fast enough for online processing, and (iii) applies several classifiers under Ensemble Learning and Incremental Learning. We perform an extensive evaluation on a large volume of DNS queries and responses collected from one ISP. The results show that the best Ensemble Learning classifiers reach 90% accuracy, while the Incremental Learning classifier reaches 80% accuracy with the highest scalability.
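The split between single-packet and multi-packet features can be illustrated with a minimal sketch. It is not the IDNS implementation: the function and class names are hypothetical, the chosen fields are only examples of what SFF might contain, and Welford's online mean/variance update stands in for the unspecified estimation algorithm as one way to maintain a per-host statistic in constant time per packet.

```python
import struct
from collections import defaultdict
from dataclasses import dataclass


def single_packet_features(payload: bytes) -> dict:
    """Read a few fields from the fixed 12-byte DNS header (RFC 1035, 4.1.1).

    The selection of fields is an assumption made for illustration.
    """
    if len(payload) < 12:
        raise ValueError("a DNS message needs at least a 12-byte header")
    _msg_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!6H", payload[:12])
    return {
        "qr": (flags >> 15) & 1,   # 0 = query, 1 = response
        "aa": (flags >> 10) & 1,   # authoritative answer
        "rd": (flags >> 8) & 1,    # recursion desired
        "ra": (flags >> 7) & 1,    # recursion available
        "rcode": flags & 0xF,      # response code
        "ancount": ancount,        # number of answer records
        "size": len(payload),      # DNS message length in bytes
    }


@dataclass
class RunningStat:
    """Welford's online update: O(1) time and memory per observation."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Population variance; 0.0 before any observation arrives.
        return self.m2 / self.n if self.n else 0.0


# Hypothetical multi-packet feature: message-size statistics per source IP.
size_stats = defaultdict(RunningStat)


def observe(src_ip: str, payload: bytes) -> dict:
    """Extract single-packet features and fold the packet into the per-IP statistic."""
    features = single_packet_features(payload)
    size_stats[src_ip].update(features["size"])
    return features
```

Calling `observe` once per captured DNS message keeps each per-host statistic current without storing past packets, which is the property an online pipeline needs. The resulting feature vectors could then be passed to any classifier.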
(Domain, IP) tuple list for users, (ii) utilizes a multi-protocol cross-validation method to verify suspicious (Domain, IP) tuples, and (iii) applies a self-feedback mechanism to iteratively calculate the correctness probabilities of (Domain, IP) tuples.
In a two-week real-world deployment, SFDS found almost 1300 correct (Domain, IP) tuples per domain per day on average. In our experiments, SFDS achieved approximately 100% accuracy.