When I review my server logs, I see many user agent strings that do not come from browsers, such as:
and so on. Compared to the limited number of browser user agent strings, these non-browser user agent strings are extremely numerous and do not follow any unifying naming principle.
I want to analyze these non-browser user agent strings. To do so, I think the best approach is to identify and exclude the browser user agent strings, which seem to have certain "tags" in common.
For example, Firefox identifies itself as "Mozilla":
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:63.0) Gecko/20100101 Firefox/63.0
and Opera identifies itself as "Opera":
Opera/9.20 (Windows 95; en-US) Presto/2.12.333 Version/12.00
Therefore, if I delete every entry from my records that contains "Mozilla", "Opera", or another browser tag, I will be left with all the non-browser user agent strings.
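To make the idea concrete, here is a minimal sketch of the filtering I have in mind, assuming the user agent strings have already been extracted from the logs into a list. The function names and the `ExampleFetcher/1.0` entry are hypothetical and only there for illustration; the tag tuple holds just the two tags mentioned so far and would need to be extended.

```python
# Tags that I currently treat as "browser" markers (incomplete on purpose).
BROWSER_TAGS = ("Mozilla", "Opera")


def is_browser_like(user_agent, tags=BROWSER_TAGS):
    """Return True if the string contains any known browser tag.

    This is a plain, case-sensitive substring check.
    """
    return any(tag in user_agent for tag in tags)


def non_browser_agents(user_agents, tags=BROWSER_TAGS):
    """Keep only the strings that contain none of the browser tags."""
    return [ua for ua in user_agents if not is_browser_like(ua, tags)]


# Illustrative input: two browser strings plus one made-up non-browser agent.
sample = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:63.0) Gecko/20100101 Firefox/63.0",
    "Opera/9.20 (Windows 95; en-US) Presto/2.12.333 Version/12.00",
    "ExampleFetcher/1.0",  # hypothetical non-browser user agent
]

remaining = non_browser_agents(sample)
print(remaining)
```

With this sample, only the made-up `ExampleFetcher/1.0` entry remains. How useful the result is depends entirely on how complete the tag list is, which is the point of my question.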
There are, of course, many other browser user agent strings, and some of them contain neither "Mozilla" nor "Opera". I think I have identified the following browser tags from my own records:
- Apple TV
But I may have mistaken some of these for browsers, and I have probably missed others that never visited my sites. So:
Are these really all browser user agent tags, or do some of them belong to other applications? And which browser user agent tags have I missed?
I understand that user agents can be forged. I understand that a bot can send a browser's user agent string and that a browser can send a user agent string that does not belong to a browser. Please do not advise me against using this method. You do not know what I want to do. Thank you.