I have processed a log file for one web site containing 127 web pages and applied data cleaning and user identification. Instead of applying frequent itemset mining or the Apriori algorithm, I would like to discover similar users based on their browsing behavior across these 127 web pages.
user timestamp page-visit
user1 time A1
user1 time A2
user2 time A3
user2 time A1
user3 time A4
user1 time A26
user1 time A100
user2 time A1
user1 time A14
user4 time A88
I have been browsing some methods for cluster analysis on log files, and I have built the following matrix for my users:
As I understand it, this matrix is built over the whole log file. My log file covers a period of one month, so when considering this matrix, what about the frequency values in each column? Do they need to be converted to some other values?
Is this procedure correct when the matrix is calculated over one month for all users?
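To make the matrix question concrete, here is a minimal sketch of a user-by-page visit-count matrix built from the sample rows above. It assumes each log row has already been reduced to a (user, page) pair and that each cell holds the number of visits over the whole period. The names `build_count_matrix`, `pages` and `matrix` are only illustrative, not part of any library.

```python
from collections import Counter, defaultdict

# Sample rows from the log above as (user, page); timestamps are omitted
# because only visit counts are needed for this sketch.
log = [
    ("user1", "A1"), ("user1", "A2"), ("user2", "A3"), ("user2", "A1"),
    ("user3", "A4"), ("user1", "A26"), ("user1", "A100"), ("user2", "A1"),
    ("user1", "A14"), ("user4", "A88"),
]

def build_count_matrix(rows):
    """Return {user: Counter({page: visit_count})} for the given rows."""
    counts = defaultdict(Counter)
    for user, page in rows:
        counts[user][page] += 1
    return counts

counts = build_count_matrix(log)

# Column order: pages sorted by their numeric suffix (A1, A2, ..., A100).
pages = sorted({page for _, page in log}, key=lambda p: int(p[1:]))

# One row per user, one column per page, cell = number of visits.
matrix = {user: [counts[user][p] for p in pages] for user in sorted(counts)}

print(pages)
for user, row in matrix.items():
    print(user, row)
```

With the full log, `pages` would list all 127 pages, so every user row would have 127 columns, most of them zero.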
How do we find similar users, and what similarity measure can be used for this kind of problem?
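One candidate measure for count vectors like these is cosine similarity between user rows. The sketch below is only an illustration of that measure, not a claim that it is the right choice for this data. The `visits` rows are the counts derived from the sample log above (columns A1, A2, A3, A4, A14, A26, A88, A100), and `cosine_similarity` is a hypothetical helper written here with the standard library only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a user with no visits is treated as similar to nobody
    return dot / (norm_u * norm_v)

# Visit counts per user, taken from the sample log rows (assumed layout).
visits = {
    "user1": [1, 1, 0, 0, 1, 1, 0, 1],
    "user2": [2, 0, 1, 0, 0, 0, 0, 0],
    "user3": [0, 0, 0, 1, 0, 0, 0, 0],
    "user4": [0, 0, 0, 0, 0, 0, 1, 0],
}

# Compare every pair of users once.
users = sorted(visits)
for i, a in enumerate(users):
    for b in users[i + 1:]:
        print(a, b, round(cosine_similarity(visits[a], visits[b]), 3))
```

Because each vector is divided by its length, this measure compares which pages users visit in which proportions rather than how active each user was overall, which partly addresses the question of converting raw frequencies. Replacing the counts with 0/1 values before comparing would be another option if only the set of visited pages matters.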