Donkey Kong Forum
General Donkey Kong Discussion => General Donkey Kong Discussion => Topic started by: Jeffw on October 05, 2013, 11:16:30 am
-
I might be doing some work on the Pauline pace program in the coming months, so I'm creating this thread for people to talk about things such as features that should be added to it, bugs and issues with the existing version, or other suggestions for improvement.
In this poll you can select which features you would be most interested in seeing. A more detailed explanation of the features is below. Note that the options marked "probably feasible" and "maybe feasible" are not guaranteed to work out. You can select 5 options (I would have allowed more, but this poll works a little differently than I thought it would and I don't think I can change it anymore). Note that all that matters is the relative difference between the votes that each option gets, so checking all of them (if it were allowed) would be equivalent to checking none of them. I don't guarantee that I will work on whatever gets the most votes; I'm basically just using this poll to get an idea of where the most interest lies. Feel free to suggest other options not on the poll in this thread.
There are a few things I plan on working on no matter what, and will put as higher priority than any of the features in the poll, and so they are not included in the poll:
- I will add a mode in which you can manually enter scores as you complete levels. This will allow arcade players to use Pauline. It will likely involve just having to enter 4 digits representing your score after every screen.
- I understand that Pauline still doesn't work properly for some people. I want it to work for everyone, so I plan on improving the robustness of the OCR by training the OCR engine to recognize specifically the numbers in DK.
Here is a more detailed description of each option in the list.
Voice Recognition
This was suggested by some people as an improved method of score input than manually typing the scores. This will likely involve saying some command followed by saying your score after each screen, and also after deaths. I think current voice recognition is robust enough for this to work out, but there is still some risk. Note that I will definitely support manual keyboard score entry so you should select this option if you think voice recognition is a worthwhile improvement over entering 4 digits after every screen.
Automation for Arcade
I've had some ideas on how to get pace tracking working automatically for arcade just like it does on mame. One example would be having the user manually set a region where the score is located along with maybe another small region for recognizing which board is being played, and then see if it can read the score and recognize the board. Probably the main challenge would be reading the score since the numbers might not be totally clear like they are in mame. Maybe with image pre-processing and/or training the OCR engine I can get it to consistently read the score correctly, but maybe not.
Dumping Stats
This would be the capability of dumping a stats file after a game containing information like the date and time of the game, final score, the score on every board, averages, paces, etc. The information would be very similar to that in the DK Data Library of Memorable Games thread (it would actually greatly speed up the process of collecting those stats for mame games by simply running Pauline while an inp plays in fast-forward). In the future I could even add stuff like an ability to query archived stats files for aggregate data over long periods of time, such as querying for a rivet average over all games in the past month.
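A minimal sketch of what such a stats dump could look like, assuming a JSON file and a list of cumulative scores per board. The function names and field names (`played_at`, `final_score`, `boards`) are hypothetical and are not taken from Pauline itself.

```python
import json
from datetime import datetime


def build_stats(board_scores, played_at):
    """Summarise one game.

    board_scores is a list of (board_label, cumulative_score) tuples in
    the order the boards were completed. The layout is hypothetical.
    """
    final_score = board_scores[-1][1] if board_scores else 0
    return {
        "played_at": played_at.isoformat(),
        "final_score": final_score,
        "boards": [{"board": b, "score": s} for b, s in board_scores],
    }


def dump_stats(stats, path):
    """Write the stats dictionary to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(stats, f, indent=2)


example = build_stats(
    [("1-1", 9_500), ("1-2", 14_000)],
    datetime(2013, 10, 5, 11, 16, 30),
)
```

Keeping one file per game in a plain format like this would also make the later "query over the past month" idea a matter of reading several files and averaging a field.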
Fullscreen Mode
Some people really don't like playing in windowed mode, so I could support running Pauline with a game in fullscreen mode. Running this way would make Pauline no longer visible, but I assume you would still be able to stream Pauline's window. If I was to do this I would probably just add support to track games in arbitrary screen regions rather than just the current window that is in focus.
Improved Pace Algorithm
The current pace algorithm uses essentially the equation pace = start_score + level_average*17 (although it's a little more complex than that). One problem with this is that you can't get a pace until you reach level 6, and pace at that point could be extremely inaccurate if the level 5 score wasn't reflective of what you normally score. Another problem is that if in game A a player is on the same level as in game B and has the same number of spare men but has a higher score in game A, then the pace in game A should be higher at that point in time than in game B since the player in game A is clearly ahead, but the naive pace equation doesn't guarantee this (i.e., 130k start + 58k level 5 will result in a significantly lower pace than 120k start + 60k level 5, even though in the first game the player is 8k ahead). I have an algorithm in mind that fixes this and also allows for a pace to be shown before level 6.
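To make the second problem concrete, here is a small sketch of the simplified equation quoted above, applied to the two example games. It only models `start_score + level_average*17`; the real algorithm is described as a little more complex, so treat this as an illustration rather than Pauline's actual code.

```python
LEVELS_MULTIPLIER = 17  # the factor used in the simplified equation above


def naive_pace(start_score, level_average):
    """Simplified pace: start score plus 17 times the level average."""
    return start_score + level_average * LEVELS_MULTIPLIER


# Game A: 130k start, 58k on level 5. Game B: 120k start, 60k on level 5.
pace_a = naive_pace(130_000, 58_000)
pace_b = naive_pace(120_000, 60_000)

# Actual scores at the end of level 5.
score_a = 130_000 + 58_000
score_b = 120_000 + 60_000

print(pace_a, pace_b, score_a - score_b)
```

Game A is 8,000 points ahead on the scoreboard, yet the simplified equation puts it 24,000 points behind on pace (1,116,000 versus 1,140,000).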
Graphical User Interface
This is pretty self-explanatory, instead of the interface being just a resized terminal window I could make an actual GUI with menus and such.
-
Improved Pace Algorithm
Another problem is that if in game A a player is on the same level as in game B and has the same number of spare men but has a higher score in game A, then the pace in game A should be higher at that point in time than in game B since the player in game A is clearly ahead, but the naive pace equation doesn't guarantee this (i.e., 130k start + 58k level 5 will result in a significantly lower pace than 120k start + 60k level 5, even though in the first game the player is 8k ahead).
Jeff, I'm not sure that this is a problem. Suppose you have 2 hypothetical players, one named Phil and one named Hank. Phil likes to go for monster starts, but starting level 5, he gets bored, runs boards and puts out poser levels. On the other hand, Hank hates restarting, so he goes for poser starts and then tries to make up for it with monster levels. Now Phil's score may be higher than Hank's at the end of L5, but by the end of the game Hank may have caught up to Phil because there are 17 * L5+ levels.
BTW, I voted for the GUI and arcade automation. That would be awesome.
-
an automated program for arcade would be Awesome!
id prefer not to have future deaths estimated in the pace calculation, i believe its programmed for 5k atm? would be cool if we could set it to 0 or whatever number we wanted...
im often curious what my previous level score was, you can get a good idea from min max and avg but it would be cool to have it displayed as well...
cheers Jeff, paulines a great little program...
8)
-
This is interesting news Jeff! Looking forward to seeing what you come up with.
Keep in mind that some of us still use ancient computers to play MAME, and if any of these features eat significantly more CPU, it could potentially cause lag and make the program unusable. It may be a good idea to have such improvements included as options that can be turned off by the user. Of course, for arcade players who will be running Pauline on a separate platform from their arcade game (obviously), these would likely all be very welcome improvements!
I'm interested to hear more about your ideas for a new and improved pace algorithm. I get the sense that instead of using the "start" score as a separate entity to add into the equation, you'll be trying to treat each of the beginning screens in the same manner as the L5+ screens, and perhaps weighting them differently due to the difference in gameplay and potential scores that occur with the differences in internal difficulties and Timer speeds and values. I feel that there are enough differences in these screens that trying to do this will automatically cause a small amount of inaccuracy, although with enough thought I'm sure that the proper pace on these screens could be very closely approximated. This might be another option to consider making user definable / configurable. That way, you could potentially have even more than 2 algorithms to choose from -- changing something as small as the death estimates, for example, to what that user feels is the most common for them.
Trying to correctly factor in each of the first 14 screens individually creates some interesting math problems. For example, the Level 2 rivets is clearly potentially higher scoring than a Level 5 rivet screen despite the fact that it has a lower starting Timer value, lower scoring prizes and slower firefoxes. Level 1 barrels is clearly lower scoring than Level 5 barrels, but by how much? The two hammers are likely to be slightly less efficient (due to more widely spaced barrels and less steerability), but will be pretty close compared to the overall number of available barrels released (50 vs. 80). Fireball speeds are another factor. So, I don't think you can just compare the difference in the Timer and assume that you should score 3000 points less on 1-1 than on 5-1. The difference is probably closer to 4000 because of these other factors.
Anyways, I'm curious about how you plan to calculate pace on a screen by screen (or Level by Level) basis during the first 14 screens. I think that could be an interesting discussion.
-
I've seen the program of course but never really had a chance to use it in my game.
I like the idea of a user input, not sure how voice recognition will work with basic talking going on during the game anyway. But, for user input, given that there are times where the input can be missed, can the program work without having all fields filled in for each board?
Some neat things to see may be:
Number of boards remaining, or a progress bar from start to 22-1, to see exactly where you are.
If you can keep Historical Data:
-Personal Best
-Highest pace achieved
-Average score
-Average Pace
-What board deaths occur on by percentage
-
I like all the ideas, particularly the full screen MAME use and arcade use. One thing that Ethan and I had discussed is to come up with a rating for a game. For example, we could keep track of how many free passes occur on conveyor, which could be as simple as adding to a count if the timer is within a certain range at the time of completion, such as a free pass = 6800 to 6400 or something like that. This could help measure the difficulty of the randomness on those stages. It would also be nice to know how many pace-killing screens came from challenging situations, which would be easy to track by adding a counter for low-scoring rivets where the board is completed with very little to no bonus timer left. If we could have some stats like these we could actually begin to rate one million-point game over another, since not all games are created equal. Perhaps that is too much to think about, but every little bit helps if it is not that much trouble, and I am sure you could even think of some other ways for your program to assist in quick rating calculations, taking into account PB and other factors that one can add and store in the program.
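The free-pass counter could be sketched roughly as below, assuming the program already knows the bonus timer value at the moment each conveyor screen was completed. The 6400 to 6800 window is the rough range floated in the post; the function name and the sample timer values are made up for illustration.

```python
# Rough window suggested in the post; the exact bounds are a guess there too.
FREE_PASS_MIN = 6_400
FREE_PASS_MAX = 6_800


def count_free_passes(conveyor_timers):
    """Count conveyor screens finished with the timer inside the window."""
    return sum(
        1 for timer in conveyor_timers
        if FREE_PASS_MIN <= timer <= FREE_PASS_MAX
    )


# Hypothetical timer values at completion for five conveyor screens.
sample_timers = [6_800, 6_500, 5_200, 6_400, 3_000]
free_passes = count_free_passes(sample_timers)
```

A counter for low-scoring rivet screens could follow the same pattern with a different threshold on the remaining timer.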
-
Improved Pace Algorithm
Another problem is that if in game A a player is on the same level as in game B and has the same number of spare men but has a higher score in game A, then the pace in game A should be higher at that point in time than in game B since the player in game A is clearly ahead, but the naive pace equation doesn't guarantee this (i.e., 130k start + 58k level 5 will result in a significantly lower pace than 120k start + 60k level 5, even though in the first game the player is 8k ahead).
Jeff, I'm not sure that this is a problem. Suppose you have 2 hypothetical players, one named Phil and one named Hank. Phil likes to go for monster starts, but starting level 5, he gets bored, runs boards and puts out poser levels. On the other hand, Hank hates restarting, so he goes for poser starts and then tries to make up for it with monster levels. Now Phil's score may be higher than Hank's at the end of L5, but by the end of the game Hank may have caught up to Phil because there are 17 * L5+ levels.
BTW, I voted for the GUI and arcade automation. That would be awesome.
I've actually talked about this before, see this post: https://donkeykongforum.net/index.php?topic=56.msg475#msg475 (https://donkeykongforum.net/index.php?topic=56.msg475#msg475)
Basically, for the situation you described I would argue that it is more correct to give Phil a high pace to start off and then lower it as he changes his strategy and runs more and more boards without point pressing, while Hank's pace should start off low since he didn't point press at all on the first few levels and it should go up as he adjusts his strategy to start point pressing. I would leave the old pace algorithm as an option to use in case that is what you would prefer.
I'm interested to hear more about your ideas for a new and improved pace algorithm. I get the sense that instead of using the "start" score as a separate entity to add into the equation, you'll be trying to treat each of the beginning screens in the same manner as the L5+ screens, and perhaps weighting them differently due to the difference in gameplay and potential scores that occur with the differences in internal difficulties and Timer speeds and values. I feel that there are enough differences in these screens that trying to do this will automatically cause a small amount of inaccuracy, although with enough thought I'm sure that the proper pace on these screens could be very closely approximated. This might be another option to consider making user definable / configurable. That way, you could potentially have even more than 2 algorithms to choose from -- changing something as small as the death estimates, for example, to what that user feels is the most common for them.
I discussed the general idea of the algorithm here (I started working on something for this a while back): https://donkeykongforum.net/index.php?topic=56.msg464#msg464 (https://donkeykongforum.net/index.php?topic=56.msg464#msg464)
-
Dumping game stats into a database that can be analyzed either in the Pauline program or exported into some other format would be awesome.
Being able to configure the program is something Dean touched on, I think, and I second that idea. It would be great to be able to set up custom pace goals. For example, if I want to go for a 120k start, 58k levels from 5 to 12, and then 55k levels to the end, and only figure in 6k for deaths. Or, maybe I want to factor in fatigue and say that my expected pace goal is 120k start, and 58k levels through level 17, then only 55k levels to the end, and I want to expect a worst-case scenario and factor in 0 points for all deaths. So, being able to tweak the expectations for the start (and maybe the levels within the start), for every level (maybe even every board type?), and deaths would be great.
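As a rough sketch of how such configurable goals might be computed, the two examples above can be expressed as a start score, a list of level ranges with a target per level, and a per-death allowance. This assumes level 21 is the last full level, assumes three deaths, and ignores any points on 22-1; none of those details are stated in the post, and the function name is hypothetical.

```python
def goal_score(start, level_targets, deaths, points_per_death):
    """Total goal from a start score, per-range level targets and deaths.

    level_targets is a list of (first_level, last_level, points_per_level)
    with inclusive level ranges.
    """
    total = start
    for first, last, per_level in level_targets:
        total += (last - first + 1) * per_level
    return total + deaths * points_per_death


# 120k start, 58k for levels 5-12, 55k for levels 13-21, 6k per death.
plan_one = goal_score(
    120_000, [(5, 12, 58_000), (13, 21, 55_000)],
    deaths=3, points_per_death=6_000,
)

# 120k start, 58k through level 17, 55k to the end, 0 points for deaths.
plan_two = goal_score(
    120_000, [(5, 17, 58_000), (18, 21, 55_000)],
    deaths=3, points_per_death=0,
)
```

Under these assumptions the first plan works out to 1,097,000 and the worst-case second plan to 1,094,000.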
Also, being able to grab other information from the screen would be cool. Is it possible, for example, to record the value for all hammer smashes (especially blue smashes), the remaining Bonus time, and/or the length/type of springs?
Thanks Jeff!
-
I could probably knock up an Android App fairly easily, might be useful for arcade players.
-
I think it would be very cool to have a pace calculator for each individual level. Like when you have the bottom hammer on a barrel screen, Pauline calculates the pace of the screen, etc. And also a stage pace.
It would be pretty cool to know the pace especially for a barrel screen throughout the whole level, especially on 1-1!
-
Just some thoughts about the pace algorithm. And I'll try and keep this as short as possible :)
Method 1:
In Australia, we worship a sport known as Cricket - I'm guessing a lot of the DK (mainly US of A) players wouldn't have ever watched it. But basically, it's a game where there is a one day version (full game is 5 days). A method was devised to provide a result for the one day variation of the game when bad weather stopped the game prior to a result being achieved. This is known by the name of its creators - Duckworth & Lewis - so the Duckworth-Lewis method. A more detailed explanation of the mathematics can be found here ---> http://en.wikipedia.org/wiki/Duckworth%E2%80%93Lewis_method (http://en.wikipedia.org/wiki/Duckworth%E2%80%93Lewis_method)
In essence, this system looks at current score, available resources left (players yet to bat) and devises a score based on historical data of how teams performed in the same situation.
If the database shows 1000 games where players have lost 2 lives out of their 4, and their current score is 100,000 - it would provide an average score based on scores in the database where others have been at this same point.
So pretty much, this system would become more and more accurate as more and more people use the program and submit their results to the database. Granted, it's a hell of a lot more work to get such a program working. The only problem with this approach is that the people most likely to use Pauline are likely to be elite killscreen capable players, so it will report false positives for the lower (like me) ranked players. That is to say, Dean, Hank, Vincent etc. could lose 3 lives on 1-1 and still kick my arse. The only way to get it more accurate would be to have every chump use it.
Method 2:
Why not just program in the scores after 1-1, 1-2, 2-1, 2-2, 2-3 etc. for Dean's 1.2m score? If after 1-1 his score was 10,100 (or whatever it was) and your score is 9,500 - then your pace is 9,500 / 10,100 * 1.2m = you are on 1.128 Million pace (hope my math is right there) - sure after 1-1 it's not going to be completely accurate, but the longer you play, the more accurate the final predicted score will be. The final "reported" pace will jump up and down dramatically early on, but I'm guessing by L5+ it will settle down and only jump a little bit each way.
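Method 2 boils down to scaling a reference game's final score by the ratio of the two scores at the same checkpoint. A minimal sketch using the numbers from the post (the 10,100 figure is itself a guess there, and the function name is hypothetical):

```python
def ratio_pace(current_score, reference_score, reference_final):
    """Scale the reference final score by current/reference at a checkpoint."""
    return current_score / reference_score * reference_final


# 9,500 after 1-1 against a reference of 10,100 in a 1.2m game.
pace_after_1_1 = ratio_pace(9_500, 10_100, 1_200_000)
print(round(pace_after_1_1))
```

This lands a little under 1.129 million, in line with the figure in the post.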
-
Personally, I don't think it makes sense to attempt a pace calculation prior to the completion of Level 05. In fact, I would argue that pace isn't particularly meaningful until the end of Level 08 (at the earliest).
As for method, I'm in favor of a least squares regression line (previously suggested by Hank). Using this scheme, the "start" defines the y-intercept of the "best fit" line, while the Level 05 (and beyond) results determine the slope.
-
This probably is not relevant but I have been keeping track of my own games by a comparison to one million pace. For example, if I am at 695,000 at the start of level 16, I know that I could get 1.010M points if I only pull 52,500 points per level. I then think of each level in terms of the +/-. For example, if I started level 17 with 749,500 from my previous level 16 start score then I was +2K for that level and I am now sitting at getting 1.012M if I only pull 52,500 per level. I don't know if anyone else keeps track of their developing score this way but I have come to appreciate it.
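The +/- bookkeeping described above can be written down in a few lines. This sketch assumes level 21 is the last full level (so six levels remain at the start of level 16) and uses the 52,500 baseline from the post; the function name is hypothetical.

```python
LAST_LEVEL = 21        # assumption: last full level counted toward pace
BASELINE = 52_500      # per-level baseline used in the post


def projected_final(score_at_level_start, level, per_level=BASELINE):
    """Projected final score if every remaining level hits the baseline."""
    levels_remaining = LAST_LEVEL - level + 1
    return score_at_level_start + levels_remaining * per_level


at_16 = projected_final(695_000, 16)
at_17 = projected_final(749_500, 17)
plus_minus = at_17 - at_16  # how level 16 compared with the baseline
```

Starting level 16 with 695,000 projects to 1,010,000; starting level 17 with 749,500 projects to 1,012,000, which is the +2K for level 16.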
-
Just thought of something while reading through everyone's suggestions. Now I admit this would be a small project, but you could have Pauline keep track of your historical board scores and learn your playing habits. This is much like how my GPS (at least it claims to) learn my driving habits and give me a better estimate of my ETA. Pauline would then know if you just happen to get a lucky 1-1 or lucky L5 and just add a few 1000 points and not overestimate the pace. It could also figure out if you're improving as well (such as if your level averages or elevators are improving) and reward you with a sultry message.
-
If the database shows 1000 games where players have lost 2 lives out of their 4, and their current score is 100,000 - it would provide an average score based on scores in the database where others have been at this same point.
So pretty much, this system would become more and more accurate as more and more people use the program and submit their results to the database. Granted, it's a hell of a lot more work to get such a program working. The only problem with this approach is that the people most likely to use Pauline are likely to be elite killscreen capable players, so it will report false positives for the lower (like me) ranked players. That is to say, Dean, Hank, Vincent etc. could lose 3 lives on 1-1 and still kick my arse. The only way to get it more accurate would be to have every chump use it.
That's an interesting idea but it's kind of the opposite direction I was thinking of moving towards. Basically, I want pace to be thought of less as a prediction of what the final score will be and more as a pure unbiased status. On an intuitive level what I am trying to achieve is this: given that the player plays with exactly the same level of aggression/point-pressing that they have for the entire game so far, what will their final score be? The current pace algorithm instead achieves this: given that the player plays with exactly the same level of point pressing that they have since level 5, what will their final score be? If you want to get an accurate prediction of what the final score will be you can have something separate that does all sorts of fancy stuff like looking at historical data and databases of games of other players like you are suggesting. Here's an example just to illustrate the difference between "status" and "prediction". Suppose a player gets 13k on 1-1, the pace algorithm I am suggesting would output a monster pace of 1.3m+. Note that this is a terrible prediction of final score since nobody will be able to get 1.3m, but that's okay since pace is a status not a prediction. An accurate prediction on the other hand would take into account historical data and output around 1m if this was a 1m-caliber player, around 1.1m if it was a 1.1m-caliber player and around 1.2m if Dean was playing, and in fact the actual score on the first screen wouldn't have too much of an impact on this prediction.
Why not just program in the scores after 1-1, 1-2, 2-1, 2-2, 2-3 etc. for Dean's 1.2m score? If after 1-1 his score was 10,100 (or whatever it was) and your score is 9,500 - then your pace is 9,500 / 10,100 * 1.2m = you are on 1.128 Million pace (hope my math is right there) - sure after 1-1 it's not going to be completely accurate, but the longer you play, the more accurate the final predicted score will be. The final "reported" pace will jump up and down dramatically early on, but I'm guessing by L5+ it will settle down and only jump a little bit each way.
I guess this is sort of similar to the algorithm I had in mind, except I wouldn't base pre-programmed values on a single game (the score on a particular level in that game might be very atypical) and I would also have multiple sets of pre-programmed values for various paces and interpolate between them.
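One way to read "multiple sets of pre-programmed values ... and interpolate between them" is piecewise-linear interpolation at each checkpoint. The sketch below is only that reading, not the algorithm Jeff has in mind, and the reference table is entirely hypothetical apart from reusing the 10,100 figure mentioned earlier in the thread.

```python
def interpolated_pace(score, refs):
    """Linearly interpolate a pace from (checkpoint_score, pace) pairs.

    refs needs at least two pairs with distinct checkpoint scores.
    Scores outside the table are extrapolated from the nearest segment.
    """
    refs = sorted(refs)
    if len(refs) < 2:
        raise ValueError("need at least two reference points")
    # Pick the first segment whose upper end covers the score; if none
    # does, the loop ends on the last segment and we extrapolate from it.
    for (s0, p0), (s1, p1) in zip(refs, refs[1:]):
        if score <= s1:
            break
    return p0 + (score - s0) * (p1 - p0) / (s1 - s0)


# Hypothetical reference scores after 1-1 for three target paces.
AFTER_1_1 = [(8_500, 1_000_000), (9_300, 1_100_000), (10_100, 1_200_000)]

pace = interpolated_pace(9_500, AFTER_1_1)
```

With this made-up table, 9,500 after 1-1 falls a quarter of the way between the 1.1m and 1.2m references, giving 1,125,000.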
Personally, I don't think it makes sense to attempt a pace calculation prior to the completion of Level 05. In fact, I would argue that pace isn't particularly meaningful until the end of Level 08 (at the earliest).
I would argue that it does have some value. For example, a viewer that doesn't know a lot about DK could get an idea of how good a particular score at the end of L2 is if a pace is shown at this time.
As for method, I'm in favor of a least squares regression line (previously suggested by Hank). Using this scheme, the "start" defines the y-intercept of the "best fit" line, while the Level 05 (and beyond) results determine the slope.
Linear regression doesn't really make sense to use in this context. It should be used instead in situations where you are comparing two different variables. In this case there is only one variable, the score on a particular level. It can look like there are two if you plot the score after each level with the score on the y-axis and the level on the x-axis, but the level is not a variable, just a counter that increments by 1. Here's an example of how linear regression can fail significantly: suppose a player gets a 130k start, then gets 60k for every level from 5 to 19, then on level 20 they switch to no point pressing and score 47k. With linear regression that final data point would barely offset the line of best fit at all and the pace calculation would still be about 1.15m, even though getting that score would require a massive 73k on L21.
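The example can be checked numerically. This sketch fits an ordinary least squares line to the cumulative score after each level (treating the start as the score after level 4) and extrapolates to the end of level 21. Those modelling choices are assumptions made to match the example; the thread does not spell out exactly how the regression would be set up.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Cumulative scores: 130k start (x = 4), then 60k per level for levels
# 5-19, then only 47k on level 20.
levels = list(range(4, 21))
scores = [130_000 + 60_000 * (lv - 4) for lv in range(4, 20)]
scores.append(scores[-1] + 47_000)

slope, intercept = least_squares(levels, scores)
regression_pace = slope * 21 + intercept
needed_on_l21 = regression_pace - scores[-1]
print(round(regression_pace), round(needed_on_l21))
```

Under these assumptions the fitted line still projects roughly 1.147m, only about 3k below the 1.15m projected before level 20, and reaching it would take about 70k on L21. That is close to the "about 1.15m" and 73k quoted above.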
-
Thanks for the explanations and points of view Jeff.
I think I might have a quick go at writing a very basic "Pauline inspired" Android app though. I'll release details and a link if I get it going. Unfortunately I can't do it for iOS atm, as Apple want to charge a ridiculous price for a developer license which I'm not going to pay for an App I won't make any money off :)
-
I would definitely like to see some sort of option for manual input, where very little input is necessary.
This would be great for spectating.
It would be cool to open up a stream of a game in progress and to be able to punch in the current score and level and to get an accurate idea of the player's pace.
I see three inputs, something like:
1. L1-L4 Start: ______
OR (if the start score is unknown, you could click one of several estimated values, based on the player, etc.)
Estimated Start:
Low (100K)
Medium (115K)
High (130K)
2. Current Score: _______
3. Current Screen: __ __
And from those inputs the program would spit out the following:
Level Avg: __________
Pace:
At Current Average: _________
Low 1-hammer (44K/L): ________
Moderate 1-hammer (47K/L): _______
High 1-hammer/Low 2-hammer (50K/L): _______
Moderate 2-hammer (55K/L): _______
Extreme Pressing (62K/L): _______
Something like that.
At the moment, I use an Excel spreadsheet to do this, but it's a tad unwieldy and an app would be better.
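A rough sketch of the calculation behind that spreadsheet, using the per-level rates listed above. It assumes the score is entered at the start of a level and that levels 5 through 21 count toward pace; the function name and the sample inputs are hypothetical.

```python
# Per-level rates taken from the list in the post.
RATES = {
    "Low 1-hammer": 44_000,
    "Moderate 1-hammer": 47_000,
    "High 1-hammer/Low 2-hammer": 50_000,
    "Moderate 2-hammer": 55_000,
    "Extreme Pressing": 62_000,
}


def pace_table(start_score, current_score, current_level, last_level=21):
    """Return (level_average, paces) for a score taken at a level start."""
    completed = current_level - 5            # full L5+ levels finished
    remaining = last_level - current_level + 1
    average = None
    paces = {name: current_score + rate * remaining
             for name, rate in RATES.items()}
    if completed > 0:
        average = (current_score - start_score) / completed
        paces["At Current Average"] = current_score + average * remaining
    return average, paces


# Hypothetical input: Medium (115K) start, 345,000 at the start of level 9.
average, paces = pace_table(115_000, 345_000, 9)
```

For that input the level average is 57,500, the pace at the current average is 1,092,500, and the Moderate 2-hammer pace is 1,060,000.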
This is much like how my GPS (at least it claims to) learn my driving habits and give me a better estimate of my ETA.
funspit
-
I want to do some experimenting with automating score input on arcade. I would like a few different arcade streamers to send me a video of one of your sessions or part of a session that was recorded directly from the webcam (i.e., not a twitch stream archive). The reason is that Pauline will have direct access to the webcam, which should allow for higher quality images than those from something like a twitch archive. I would like to get a good variety of videos with different webcams and different setups, so even if you don't have a perfect streaming setup with the webcam pointed directly at the screen, you should still send a video (in fact, I might be more interested in these lower quality streaming setups). The video itself should have as high an image quality as possible, with frame rate being much less important than image quality. The preferred video format is AVI. If you can't send a video, even screenshots taken from the webcam would be helpful. You can either post videos/screenshots in this thread or PM them to me.
-
Next time I stream, I'll record directly from the webcam.
This would be awesome!
-
But the webcam will be in use by OBS or Xsplit so how will you access it exactly? I think its awesome idea and cant wait just wanted to point that out I have all kinds of issues with skype/xsplit with "Webcam is in use"
Dan
-
I have a good sample from my stream, Ross's 1,136,700. There are actually two streams going in the video, effectively giving two good data sets.
Download the FLV in zip format here...
http://www.athometech.com/files/DK1136700.zip (http://www.athometech.com/files/DK1136700.zip)
-Ken
-
Edit: disregard prior comment... realized Dan was talking to Jeff, not me.
-
But the webcam will be in use by OBS or Xsplit so how will you access it exactly? I think its awesome idea and cant wait just wanted to point that out I have all kinds of issues with skype/xsplit with "Webcam is in use"
Dan
You might have to use something like manycam (http://manycam.com/) or splitcam (http://splitcamera.com/), which allow the webcam to be used by multiple applications at once. It's kind of unfortunate that this would be required since it adds a step to run Pauline on arcade, but I'm not sure that there is any better solution. I'll look into this more later once Pauline can actually work on arcade to see if there is a better solution.
-
Sweet I'd still do it if it works. I miss pauline :D
-
So I've attempted to train Tesseract, the OCR engine used by Pauline, for the DK font. This was done to make reading the score on arcade easier, and also to improve accuracy on MAME. At the moment it's not clear whether the training has offered any improvement for accuracy on arcade, and I want to know if it improved MAME accuracy at all. I know that Pauline was reading the score wrong for a few people and I would like to know if the training fixes that. So if you are one of those people who were having problems, can you download this attached file, eng.traineddata, and put it in the tessdata directory of Pauline, replacing the existing eng.traineddata file and see if it improves at all, and let me know the results.
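For anyone swapping the file by hand, a small sketch of the replace step that keeps a backup of the existing file, so the default can be restored if the new one reads worse. The function name and the `.bak` naming are hypothetical; the only details taken from the post are the `eng.traineddata` filename and the `tessdata` directory.

```python
import shutil
from pathlib import Path


def install_traineddata(new_file, tessdata_dir):
    """Copy new_file over eng.traineddata, backing up the original once."""
    tessdata_dir = Path(tessdata_dir)
    target = tessdata_dir / "eng.traineddata"
    backup = tessdata_dir / "eng.traineddata.bak"
    # Only back up the first time, so the default file is never overwritten
    # by an already-replaced copy.
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)
    shutil.copy2(new_file, target)
    return target
```

Restoring the default is then just a matter of copying `eng.traineddata.bak` back over `eng.traineddata`.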
-
I was having a few minor issues with it: not displaying points from deaths and, on occasion, over-inflated barrel board screen scores. Might have something to do with the points from death carrying over into the total for the board. I'll grab the attachment when I get home from work. Thanks Jeff!
-
I've done a few tests and the new eng.traineddata file has worked flawlessly. The old file would misidentify many of the digits on my screen. Great work!
-
Nice, Jeff! I was just reading about how to train Tesseract for the DK font the other day. Pauline doesn't read some digits correctly for me and I've noticed others having the same problem. I've tried using Tesseract for .NET and I couldn't get consistent results with it even after some preprocessing of the image. I'm using Asprise OCR right now and it works perfectly. I'll give the new file a test.
-
I've done a few tests and the new eng.traineddata file has worked flawlessly. The old file would misidentify many of the digits on my screen. Great work!
That's good to hear. Hopefully I can get some results from a few other people as well to make sure this new file works better. It seems that on arcade this new file causes worse performance than the default file. I've reached near-perfect OCR accuracy for arcade on my test video using image pre-processing, but only with the default eng.traineddata; the new one still provides pretty good accuracy but not as good as the default. It's probably because the font I created to train Tesseract doesn't take into account the glow that happens on arcade, where pixels light up surrounding pixels and make things like the hole in the middle of the 4 difficult to see.
-
Can I get any more people to send me a video? I've received one video from Hank that had a high enough resolution for score reading to be successful. Ken's video, while useful for testing things like the score location and orientation detection algorithm, was too low resolution to accurately read the score, even with image preprocessing. The video doesn't need to be long, about 30 minutes should be fine, but it should be high resolution, ideally the maximum resolution that your webcam supports. This is because when Pauline is running on arcade it will be able to select the maximum resolution supported by the webcam and take images of that resolution.
-
my cab isnt running atm but here is some old vids i had lying around, dont think settings were optimal but may be of use...
http://www.filedropper.com/test93 (http://www.filedropper.com/test93)
http://www.filedropper.com/2013-03-160809069621 (http://www.filedropper.com/2013-03-160809069621)
i also have raw hd camcorder footage if its any use, .mts...
8)