Go4Expert

Go4Expert (http://www.go4expert.com/)
-   Database (http://www.go4expert.com/forums/database-forum/)
-   -   Help with parsing mysql db data (http://www.go4expert.com/forums/help-parsing-mysql-db-data-t18659/)

sachinb 22Jul2009 23:56

Help with parsing mysql db data
 
This is my first post on this forum. Hopefully someone can help me. I am trying to get a count in a dbtable of a tree-like structure. The way it works is: I have three fields: member id, left node id, right node id.
So, table would look like
member_id | leftnode_id | rightnode_id
101 | 102 | 103
102 | 104 | 105
103 | 106 | 0
104 | 107 | 108
105 | 109 | 0
106 | 0 | 0
107 | 110 | 0
108 | 0 | 0
109 | 0 | 0

I'm trying to parse the db such that for every member_id, it gives me a left node count, i.e. how many nodes are on the left side. for example member 101's left node count will include total node count of member 102 that is anything below, and right node count will have node count of member 103.

I tried using sql parsing and creating the tree on the fly, but the script fails as the numbers of members increases.

Is it possible to find the count in reverse order, where the db first goes thru the last member_id, and generates the node count in reverse.
In this case, it would go to member 109 and in leftnode count put 0 and right node count put 0, same with 108 and 106.
For member 105, the left node count will be 1 which is just node 109 and right node count will be 0.
For member 104, the left node count will be 1 + node count of 110 (1) = 2 and right node count will be 1.
For member 103, the left node count will be 1.
For member 102, left node count will be 1 + 2 (node count of 104) = 3 and right node count will be 2 (member 105 + node count of 105).
For member 101, left node count will be 1 (member 102) + 5 (total node count of member 102) = 6, right node count will be 2 (member 103 + node count of 103)

any thoughts? any help will be appreciated.

Sachin

sachinb 28Jul2009 02:49

Re: Help with parsing mysql db data
 
no one has any thoughts on this? I just want to know what would be the best way to start it off. would a perl script be better?

xpi0t0s 28Jul2009 03:00

Re: Help with parsing mysql db data
 
I don't know if a query can be constructed that will return the results you want.

It's a fairly easy problem to solve using some kind of procedural solution; for each node, work out the number of children, recursively (or iteratively) until you have a final solution: 102 (say) has 2 plus the number of children 104 and 105 have; 104 has 2+(107 and 108's children), and so on until all outstanding "N's children" have been resolved.

sachinb 28Jul2009 05:32

Re: Help with parsing mysql db data
 
Quote:

Originally Posted by xpi0t0s (Post 53687)
I don't know if a query can be constructed that will return the results you want.

It's a fairly easy problem to solve using some kind of procedural solution; for each node, work out the number of children, recursively (or iteratively) until you have a final solution: 102 (say) has 2 plus the number of children 104 and 105 have; 104 has 2+(107 and 108's children), and so on until all outstanding "N's children" have been resolved.

Thanks for the response. We tried that, but when it is done in real time, the system crashes after it reaches 15 levels.

So we wanted to find a way where instead of real-time if a procedure can be run in the background, or a cron every night that would go through the member list and generate the total node count for each member. We played with sql procedures, but are wondering if perl would be better performance-wise if going through a list of 1000 or more members.

xpi0t0s 28Jul2009 11:54

Re: Help with parsing mysql db data
 
A crash means you have a programming error which you need to find and fix. This doesn't mean your approach is wrong. Maybe it ran out of stack.
If that's the case and you can't increase the stack, try an iterative solution instead. For example, first initialise an array as long as the number of rows in the table to -1 (represented as a 4th column):
101 | 102 | 103 | -1
102 | 104 | 105 | -1
103 | 106 | 0 | -1
104 | 107 | 108 | -1
105 | 109 | 0 | -1
106 | 0 | 0 | -1
107 | 110 | 0 | -1
108 | 0 | 0 | -1
109 | 0 | 0 | -1

Scan through the table and where number of children is known, fill it in.
So we don't know for 101, because rows 102 and 103 are -1.
Same for 102, 103, 104, 105.
106 has no children so set the -1 to 0.
Not sure about 107 because there isn't a 110 in the table.
108 has no children so set the -1 to 0, ditto 109 (and maybe 110 as I imagine at this stage you also have 110 | 0 | 0 [| -1]). So we now have:
101 | 102 | 103 | -1
102 | 104 | 105 | -1
103 | 106 | 0 | -1
104 | 107 | 108 | -1
105 | 109 | 0 | -1
106 | 0 | 0 | 0
107 | 110 | 0 | -1
108 | 0 | 0 | 0
109 | 0 | 0 | 0
110 | 0 | 0 | 0

We made changes so start again from the top of the table.
101: lookup the 4th column for 102,103, both are -1, can't tell, move onto the next.
102: lookup 104,105, ditto
103: aha, 106 has no children so we can tell now that 103 has precisely 1 child.
104: 108 is resolved but 107 isn't so we can't tell yet.
105: has child 109 which is resolved -> 1
107: skip this because column 4!=-1

(you can do the rest) At the end of this iteration we should have:
101 | 102 | 103 | -1
102 | 104 | 105 | -1
103 | 106 | 0 | 1
104 | 107 | 108 | -1
105 | 109 | 0 | 1
106 | 0 | 0 | 0
107 | 110 | 0 | 1
108 | 0 | 0 | 0
109 | 0 | 0 | 0
110 | 0 | 0 | 0

Only 3 -1's to resolve in the next iteration which should end:
101 | 102 | 103 | -1
102 | 104 | 105 | -1
103 | 106 | 0 | 1
104 | 107 | 108 | 3
105 | 109 | 0 | 1
106 | 0 | 0 | 0
107 | 110 | 0 | 1
108 | 0 | 0 | 0
109 | 0 | 0 | 0
110 | 0 | 0 | 0

and the final result should be:
101 | 102 | 103 | 9
102 | 104 | 105 | 6
103 | 106 | 0 | 1
104 | 107 | 108 | 3
105 | 109 | 0 | 1
106 | 0 | 0 | 0
107 | 110 | 0 | 1
108 | 0 | 0 | 0
109 | 0 | 0 | 0
110 | 0 | 0 | 0

So the pseudocode for this will be something like:
Code:

changes_made=1
while (changes_made)
{
  changes_made=0
  for each row
  {
    if col4==-1
    {
      lookup col4 in row corresponding to left node
      lookup col4 in row corresponding to right node
      if both are not -1
      {
        col4=1 if left node is !0 +
              1 if right node is !0 +
              col4 from left node's col4 +
              col4 from right node's col4
        changes_made=1
      }
    }
  }
}

If you store these intermediate results in the table then you would need to make sure that whenever a row was modified that the dependent rows were also updated. So let's add a node 111 to 108:
101 | 102 | 103 | 9 !!
102 | 104 | 105 | 6 !!
103 | 106 | 0 | 1
104 | 107 | 108 | 3 !!
105 | 109 | 0 | 1
106 | 0 | 0 | 0
107 | 110 | 0 | 1
108 | 111 | 0 | 0
109 | 0 | 0 | 0
110 | 0 | 0 | 0
111 | 0 | 0 | 0
So we could increase 108's col4 by 1 at this point but we need then to recalculate the results for any dependent rows. You could do this by working backwards: 104 contains a reference to 108 so set col4=-1. 102 refers to 104 so set that to -1, and 101 to 102:
101 | 102 | 103 | -1
102 | 104 | 105 | -1
103 | 106 | 0 | 1
104 | 107 | 108 | -1
105 | 109 | 0 | 1
106 | 0 | 0 | 0
107 | 110 | 0 | 1
108 | 111 | 0 | 0
109 | 0 | 0 | 0
110 | 0 | 0 | 0
111 | 0 | 0 | 0
then rerun the iteration to resolve the -1's. Or you could recalculate as you work backwards, whichever is easier.

sachinb 28Jul2009 19:31

Re: Help with parsing mysql db data
 
Thank you so much for the detailed explanation tutorial. I will try it out and let you know the results. I really appreciate it because our code was working till the tree reaches level 15, but then it bombs out. I will try to implement the fourth column with -1, maybe that was the missing part. Thanks once again!


All times are GMT +5.5. The time now is 17:20.