Abstract: This paper proposes MIRD, an approach to malicious URL detection that uses trigram-based common patterns of URLs and incorporates random domain name recognition. In MIRD, a common pattern is composed of the common patterns of three URL segments: the domain segment, the path name segment, and the file name segment. An inverted index built on trigrams is used to improve the extraction of common patterns for each segment, and MIRD matches the URL under detection against the common patterns through this inverted index. Moreover, MIRD incorporates a Random Domain Name Recognition Module, named RDM. The RDM identifies the length of the domain name and resolves the domain name iteratively to recognize domain names that cannot be resolved, reducing the cumulative error rate of malicious URL detection. Extensive experiments show that MIRD is efficient and scalable.
Keywords: Malicious URL Detection, Common Pattern, Trigram, Inverted Index, Random Domain Name
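
To make the idea of a trigram-based inverted index over URL segments concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the MIRD implementation: the function names (`trigrams`, `split_segments`, `build_index`, `candidates`), the way a URL is split into its three segments, and the use of a shared-trigram count as the matching signal are all hypothetical choices made for this example.

```python
from collections import defaultdict
from urllib.parse import urlsplit
import posixpath


def trigrams(text):
    """Return the set of overlapping 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def split_segments(url):
    """Split a URL into (domain, path name, file name) segments.

    Assumption: the path name is the directory part of the URL path and
    the file name is its last component.
    """
    parts = urlsplit(url if "://" in url else "http://" + url)
    directory, filename = posixpath.split(parts.path)
    return parts.hostname or "", directory, filename


def build_index(patterns):
    """Map each trigram to the ids of the patterns that contain it."""
    index = defaultdict(set)
    for pid, pattern in enumerate(patterns):
        for gram in trigrams(pattern):
            index[gram].add(pid)
    return index


def candidates(index, segment):
    """Count, per pattern id, the trigrams it shares with segment."""
    hits = defaultdict(int)
    for gram in trigrams(segment):
        for pid in index.get(gram, ()):
            hits[pid] += 1
    return dict(hits)
```

In this sketch one index would be built per segment type (domain, path name, file name). A segment of the URL under detection is then looked up by its trigrams only, so patterns that share no trigram with it are never examined.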