Thursday, February 18, 2010

Install Server 14.0.1 on Ubuntu 9.10

This is the procedure I followed:
  1. Download the server from "http://sourceforge.net/projects/sserver/files/". The essential components are rcssbase, rcssserver, rcssmonitor, and rcsslogplayer.
  2. To set up GCC and the rest of the compiler toolchain, install the build-essential package: run "sudo apt-get install build-essential" or use Synaptic Package Manager.
  3. If you get 'E: Couldn't find package build-essentials', see "http://ubuntuforums.org/showthread.php?t=436647".
  4. Then install these packages (I recommend Synaptic, but "sudo apt-get install" works too):
    • libboost-dev
    • libboost-filesystem
    • libboost-filesystem-dev
    • libx11-dev
    • libqt
    • libqt-dev
    • flex
    • bison
    • yacc
  5. Install Base:
    • ./configure
    • make
    • sudo make install
  6. Install Server:
    • ./configure
    • make
    • sudo make install
    • sudo /sbin/ldconfig
  7. Install Monitor:
    • ./configure
    • make
    • sudo make install
  8. Install Log Player:
    • ./configure
    • make
    • sudo make install
  9. Test the installation:
    • rcsoccersim
    • rcssserver
    • rcssmonitor
    • rcsslogplayer
  10. Start a match:
    • download the UVA source code (Code)(Doc)(Read Me)
    • ./configure
    • make
    • rcsoccersim
    • ./start.sh
  11. Read the Source Code
    • PlayerTeams.cpp
    • Player.(h|cpp)
    • WorldModel.(h|cpp)
    • BasicPlayer.(h|cpp)


Sources:
http://wrighteagle.org/2D/
http://sourceforge.net/apps/mediawiki/sserver/index.php?title=Users_Manual/Overview
http://mnt.ir/Nemesis

Wednesday, February 17, 2010

RoboCup Simulation 2D Results

RoboCup 2009: (Austria)
http://www.robocup2009.org/130-0-results
http://romeo.ist.tugraz.at/robocup2009/

RoboCup 2008: (China)
http://www.robocup.de/RC08/standing.html
http://www.robocup.de/RC08/results.html
http://www.robocup.de/RC08/binaries.html

RoboCup 2007: (USA)
https://wiki.cc.gatech.edu/robocup/index.php/Soccer_Simulation
https://wiki.cc.gatech.edu/robocup/index.php/Results#Places

Referees

Rules Judged by the Automated Referee

Kick-Off
Just before a kick-off (either at the start of a half or after a goal), all players must be in their own half. To allow for this, the referee suspends the match for 5 seconds after a goal is scored. During this interval, a player can use the move command to teleport to a position within its own half rather than run there, which is much slower and consumes stamina. If a player remains in the opponent half after the 5-second interval has expired, or tries to teleport there during the interval, the referee moves the player to a random position within its own half.
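A minimal Python sketch of the kick-off check described above. The pitch size (105 x 68 meters), the convention that the left team defends negative x, and the function names are my assumptions for illustration, not the server's actual code.

```python
import random

PITCH_LENGTH = 105.0  # assumed pitch length in meters
PITCH_WIDTH = 68.0    # assumed pitch width in meters


def in_own_half(x, side):
    """True if x lies in the half owned by `side` ('l' is assumed to defend negative x)."""
    return x <= 0.0 if side == "l" else x >= 0.0


def enforce_kick_off(pos, side, rng=random):
    """Return pos unchanged if it is legal, else a random point in the own half."""
    x, y = pos
    if in_own_half(x, side):
        return pos
    half = PITCH_LENGTH / 2.0
    new_x = rng.uniform(-half, 0.0) if side == "l" else rng.uniform(0.0, half)
    new_y = rng.uniform(-PITCH_WIDTH / 2.0, PITCH_WIDTH / 2.0)
    return (new_x, new_y)
```

A left-side player standing at (10, 5) would be relocated somewhere with x <= 0, while one at (-10, 5) stays where it is.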

Goal
When a team scores, the referee performs a number of tasks. Initially, it announces the goal by broadcasting a message to all players. It also updates the score, moves the ball to the center mark, and changes the play-mode to kick_off_x (where x is either left or right). Finally, it suspends the match for 5 seconds allowing players to move back to their own half (as described above in the "Kick-Off" section).

Out of Field
When the ball goes out of the field, the referee moves the ball to a proper position (a touchline, corner or goal-area) and changes the play-mode to kick_in, corner_kick, or goal_kick. In the case of a corner kick, the referee places the ball at (1m, 1m) inside the appropriate corner of the field.

Player Clearance
When the play-mode is kick_off, free_kick, indirect_free_kick, kick_in, or corner_kick, the referee removes all defending players located within a circle centered on the ball. The radius of this circle is a parameter within the server (normally 9.15 meters). The removed players are placed on the perimeter of that circle. When the play-mode is offside, all offending players are moved back to a non-offside position. Offending players in this case are all players in the offside area and all players inside a circle with radius 9.15 meters from the ball. When the play-mode is goal_kick, all offending players are moved outside the penalty area. The offending players cannot re-enter the penalty area while the goal kick takes place. The play-mode changes to play_on immediately after the ball goes outside the penalty area.
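The circle rule can be sketched in a few lines of Python. This is only an illustration under my own assumptions: positions are (x, y) tuples in meters, and a player standing exactly on the ball is pushed out along +x, which is an arbitrary choice.

```python
import math


def clear_from_ball(player, ball, radius=9.15):
    """Push a player that is inside the circle out to its perimeter."""
    dx, dy = player[0] - ball[0], player[1] - ball[1]
    dist = math.hypot(dx, dy)
    if dist >= radius:
        return player  # already outside, nothing to do
    if dist == 0.0:
        # Direction is undefined here; +x is an arbitrary assumption.
        return (ball[0] + radius, ball[1])
    scale = radius / dist
    return (ball[0] + dx * scale, ball[1] + dy * scale)
```

The player keeps its bearing from the ball and only its distance changes, which matches "placed on the perimeter of that circle".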

Play-Mode Control
When the play-mode is kick_off, free_kick, kick_in, or corner_kick, the referee changes the play-mode to play_on immediately after the ball starts moving as the result of a kick command.

Offside
A player is marked offside if it is
  • in the opponent half of the field,
  • closer to the opponent goal than the second-to-last defending player,
  • closer to the opponent goal than the ball,
  • closer to the ball than 2.5 meters (this can be changed with the server parameter offside_active_area_size).
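
The four conditions can be combined into one predicate. The Python sketch below assumes the attacking team plays toward positive x and reads the defender condition as in real soccer (fewer than two defenders are level with or behind the player). The function name and argument layout are mine.

```python
import math


def is_offside(player, ball, defenders, active_area=2.5):
    """player, ball: (x, y) tuples; defenders: list of (x, y). Attack is toward +x."""
    px = player[0]
    in_opponent_half = px > 0.0
    # Count defenders that are at least as close to their own goal as the player.
    defenders_behind = sum(1 for d in defenders if d[0] >= px)
    beyond_defenders = defenders_behind < 2
    beyond_ball = px > ball[0]
    near_ball = math.hypot(px - ball[0], player[1] - ball[1]) < active_area
    return in_opponent_half and beyond_defenders and beyond_ball and near_ball
```

The `active_area` argument plays the role of the offside_active_area_size server parameter.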

Backpasses
Just as in real soccer, the goalie is not allowed to catch a ball that was passed to him by a teammate. If this happens, the referee calls back_pass_l or back_pass_r and awards an indirect free kick to the opposing team at the point where the ball was caught, but outside the goal area. Note that it is perfectly legal to pass the ball to the goalie as long as the goalie does not try to catch it.

Free Kick Faults
When taking a free kick, corner kick, goalie free kick, or kick in, a player is not allowed to pass the ball to itself. If a player kicks the ball again after performing one of those free kicks, the referee calls a free_kick_fault_l or free_kick_fault_r and the opposing team is awarded a free_kick.

As a player may have to kick the ball more than once to accelerate it to the desired speed, a free kick fault is only called if the player taking the free kick
  1. is the first player to kick the ball again, and
  2. has moved (= dashed) between the kicks.
So issuing command sequences like kick-kick-dash or kick-turn-kick is perfectly legal. The sequence kick-dash-kick, on the other hand, results in a free kick fault.
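
The two conditions translate into a small check over the command sequence. This Python sketch is an illustration only: the (player_id, command) log format is hypothetical, and only the first kick after the free kick is examined.

```python
def is_free_kick_fault(actions):
    """actions: sequence of (player_id, command) tuples, starting at the free kick.

    The first "kick" in the sequence is taken to be the free kick itself.
    """
    taker = None
    dashed = False
    for player, command in actions:
        if taker is None:
            if command == "kick":
                taker = player
            continue
        if command == "kick":
            # Only the first kick after the free kick decides the outcome.
            return player == taker and dashed
        if player == taker and command == "dash":
            dashed = True
    return False
```

With this check, kick-kick-dash and kick-turn-kick by the same player are legal, while kick-dash-kick is a fault.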

Drop ball

Half-Time and Time-Up
The referee suspends the match when the first or the second half finishes. The default length of each half is 3000 simulation cycles (about 5 minutes). If the match is drawn after the second half, it is extended. Extra time continues until a goal is scored, and the team that scores the first goal in extra time wins the game. This is also known as the "golden goal" rule or "sudden death".
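
The "about 5 minutes" figure follows from the cycle length. The quick Python check below assumes one simulation cycle lasts 100 ms, which is my assumption about the server default and not stated in the text above.

```python
SIMULATOR_STEP_MS = 100   # assumed length of one simulation cycle
HALF_TIME_CYCLES = 3000   # default half length from the rules above


def cycles_to_minutes(cycles, step_ms=SIMULATOR_STEP_MS):
    """Convert a number of simulation cycles to wall-clock minutes."""
    return cycles * step_ms / 1000.0 / 60.0
```

Under that assumption, 3000 cycles take 300 seconds, so a full match without extra time lasts about 10 minutes.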

Rules Judged by the Human Referee

Fouls like "obstruction" are difficult to judge automatically because they concern players' intentions. To resolve such situations, the server provides an interface for human intervention. This way, a human referee can suspend the match and give free kicks to either of the teams. The following guidelines were agreed upon before the RoboCup 2000 competition and have been used since then.
  • Surrounding the ball
  • Blocking the goal with too many players
  • Not putting the ball into play after a given number of cycles. By now this rule is handled by the automatic referee, as well. If a team fails to put the ball back into play for drop_ball_time cycles, a drop_ball is issued by the referee. However, if a team repeatedly fails to put the ball into play, the human referee may drop the ball prematurely.
  • Intentionally blocking the movement of other players
  • Abusing the goalie's catch command (the goalie may not repeatedly kick and catch the ball, as this provides a safe way to move the ball anywhere within the penalty area).
  • Flooding the Server with Messages. A player should not send more than 3 or 4 commands per simulation cycle to the soccer server. Abuse may be checked if the server is jammed, or upon request after a game.
  • Inappropriate Behavior. If a player is observed to interfere with the match in an inappropriate way, the human referee can suspend the match and give a free kick to the opposing team.
Source:
http://sourceforge.net/apps/mediawiki/sserver/index.php?title=Users_Manual/Overview

Tuesday, February 16, 2010

Brainstormers: NeuroHassle

Defending against incoming attacks and recapturing the ball is a crucial task for every team. A defensive strategy consists of two sub-tasks: positioning and hassling. Positioning arranges players in free space so that they can intercept potential opponent passes, cover the direct defender, mark the attacker in possession of the ball, and deny the opponent a clear shot at the goal. Hassling improves a defender's aggressiveness so that he can interfere with the opponent ball leader, “hassle” him, and bring the ball under control while simultaneously hindering him from dribbling ahead. Assigning these two tasks is challenging because they can conflict and produce two undesired situations: either no one interferes with the attacker, or two players decide to hassle the ball leader and leave a breach in the defensive formation or an opponent uncovered. The assignment should also maximize the collaborative defense utility. A common choice is to give the hassling task to the player closest to the ball while the others maintain a good defensive coverage formation.
Conquering the ball from an attacking player is risky and difficult to implement because (i) it is hard to devise a simple scheme that handles the broad variety of dribbling strategies in use, (ii) there is a risk of over-specializing to some types of dribbling strategy and losing generality for others, which lowers the overall efficiency of the scheme, and (iii) the duel between attacker and defender matters: if the defender loses it, the attacker overruns him and gains more space and better opportunities with few defenders ahead.

The “Brainstormers” team has used an effective scheme for the hassling task, called NeuroHassle, since the RoboCup 2007 competition. We are working on an enhanced version of this approach to be embedded in our block mechanism. The goal is to train defensive agents with reinforcement learning to hassle an attacker. In other words, a naïve defender finds a policy by trial and error for conquering the ball from an opponent ball leader, with no a priori knowledge of his dribbling capabilities. The proposed reinforcement learning solution is value function estimation with a multilayer perceptron neural network. The architecture of our proposed solution differs slightly from the one explained in [], yet uses similar basics and training concepts.

Architecture:
An MLP neural network with one hidden layer of 20 neurons with sigmoidal activation functions. Training runs in batch mode and uses back-propagation to minimize the mean squared error of the value function approximation.
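
To make the architecture concrete, here is a small pure-Python sketch of such a network: one hidden layer of 20 sigmoid units trained by batch back-propagation on the mean squared error. The linear output unit, the weight initialization range, the learning rate and the class name are my assumptions, not details of the original approach.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ValueMLP:
    """One hidden layer of sigmoid units and a single linear output (assumed)."""

    def __init__(self, n_inputs, n_hidden=20, seed=0):
        rng = random.Random(seed)
        # Each hidden row holds n_inputs weights plus a trailing bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def _hidden(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                for row in self.w_hidden]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * v for w, v in zip(self.w_out, h)) + self.w_out[-1]

    def mse(self, samples):
        return sum((self.predict(x) - t) ** 2 for x, t in samples) / len(samples)

    def train_batch(self, samples, lr=0.05):
        """One batch back-propagation step on the mean squared error."""
        g_out = [0.0] * len(self.w_out)
        g_hidden = [[0.0] * len(row) for row in self.w_hidden]
        for x, target in samples:
            h = self._hidden(x)
            y = sum(w * v for w, v in zip(self.w_out, h)) + self.w_out[-1]
            err = 2.0 * (y - target) / len(samples)  # dMSE/dy for this sample
            for j, hj in enumerate(h):
                g_out[j] += err * hj
                delta = err * self.w_out[j] * hj * (1.0 - hj)
                for i, xi in enumerate(x):
                    g_hidden[j][i] += delta * xi
                g_hidden[j][-1] += delta
            g_out[-1] += err
        # Apply the accumulated gradients only after the whole batch is seen.
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] -= lr * g_hidden[j][i]
        self.w_out = [w - lr * g for w, g in zip(self.w_out, g_out)]
```

Calling `train_batch` repeatedly on a list of (feature vector, target value) pairs should drive `mse` down; the targets would be the value estimates produced by the reinforcement learning loop.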
Inputs: These features are extracted from the environment and fed to the neural network.
1. Distance between defender and ball possessing attacker (Scalar)
2. Distance between ball and our goal (Scalar)
3. Velocity of defender (Vectored and Relative)
4. Velocity of attacker (Scalar: The absolute value of velocity)
5. Position of the ball (Vectored and Relative)
6. Defender body angle (Relative)
7. Attacker body angle (Relative to his direction toward our center of goal)
8. Strategic angle (GÔM: G is the center of the goal, O is the position of the opponent, and M is the position of our player)
9. Stamina of the defender
The coordinate system is centered on our player, and the abscissa is aligned with the line through our player and the opponent player. The degree of partial observability is kept low.
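
As an example of one input feature, the strategic angle GÔM can be computed from three positions. This Python sketch assumes global (x, y) coordinates and returns the unsigned angle in degrees; whether the original feature is signed is not stated, so that is my assumption.

```python
import math


def strategic_angle(goal, opponent, me):
    """Angle at O between the rays O->G and O->M, in degrees (0 to 180)."""
    ax, ay = goal[0] - opponent[0], goal[1] - opponent[1]
    bx, by = me[0] - opponent[0], me[1] - opponent[1]
    dot = ax * bx + ay * by
    cross = ax * by - ay * bx
    return math.degrees(math.atan2(abs(cross), dot))
```

An angle near 0 means our player stands on the line between the opponent and the center of our goal, and an angle near 180 means the opponent has already passed him.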

Training:
A large training data set should be provided for this task. It should cover various velocities and body angles of the players and initial positions of the ball between them (to handle different start-up situations for dribbling and defending), various regions of the field (because dribbling players are very likely to behave differently depending on where they are on the field), different adversary agents (to avoid over-specialization and maintain generalization), and different stamina levels of the defender (to reflect realistic game situations).
Reinforcement Signal: The outcome of a training scenario falls into one of several categories, and the agent receives a different reinforcement for each:
  • Erroneous Episode: A failure caused by the attacker losing the ball through a mistake, going out of the field, wrong self-localization of the agent, etc. Erroneous episodes are omitted from the training data.
  • Success: The defender conquers the ball, either by having it inside his kickable area or by having a probably successful tackle opportunity. This outcome is rewarded with a large value.
  • Opponent Panic: The attacking ball leader stops dribbling. This happens (i) when a defender approaches the attacker, (ii) when the defender hassles him too much, or (iii) when he simply does not consider the situation suitable for dribbling. In these cases the attacker kicks the ball away as a pass, toward the goal, or somewhere else (usually forward). This outcome is considered a draw, and depending on whether the shot is toward the goal or not, we penalize or reward it with a small value.
  • Failure: None of the other cases has occurred. This means that the attacker has the ball in his kick range and has overrun the defender by some distance, or has approached the goal so closely that a shot on goal can hardly be stopped. This outcome is punished with a large value.
  • Time Out: The struggle over the ball does not reach one of the above states within a reasonable time. This situation is punished or rewarded based on the offset of the ball from its initial position.
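
The outcome categories above map naturally onto a terminal reward function. In the Python sketch below the magnitudes (1.0 and 0.1), the outcome labels, and the sign convention for the ball offset are all hypothetical; the list only distinguishes large and small values.

```python
# Hypothetical magnitudes: the categories only say "large" and "small".
BIG, SMALL = 1.0, 0.1


def episode_reward(outcome, shot_on_goal=False, ball_advance=0.0):
    """Return the terminal reinforcement, or None if the episode is dropped."""
    if outcome == "erroneous":
        return None  # omitted from the training data
    if outcome == "success":
        return BIG
    if outcome == "failure":
        return -BIG
    if outcome == "opponent_panic":
        return -SMALL if shot_on_goal else SMALL
    if outcome == "time_out":
        # ball_advance > 0 means the ball moved toward our goal (assumed convention).
        return -SMALL if ball_advance > 0.0 else SMALL
    raise ValueError(f"unknown outcome: {outcome}")
```

Returning None for erroneous episodes lets the training loop filter them out before they reach the value function update.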
The learning task is episodic and the scenario is reset after each episode, so there is no need for discounting and the learning rate used should be 1.0. Also, to enable exploration toward better and more effective defensive solutions, we use an energy-saving criterion mixed with Boltzmann exploration to modify the online greedy policy during training. The idea behind this choice is that, although large sets of random episode start situations bring about a good level of state space exploration, as assumed in the paper, the policy found may not be efficient in terms of stamina, may not cover enough dribbling tricks, and may not generalize properly.
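
Boltzmann exploration picks actions with probability proportional to exp(value / temperature). The Python sketch below shows only that part; the energy-saving term is left out, and the function names and the default temperature are my own choices.

```python
import math
import random


def boltzmann_probabilities(values, temperature=1.0):
    """Softmax over action values; subtracting the max keeps exp() stable."""
    m = max(values)
    weights = [math.exp((v - m) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_choice(values, temperature=1.0, rng=random):
    """Sample an action index according to the Boltzmann distribution."""
    probs = boltzmann_probabilities(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]
```

A high temperature makes the choice nearly uniform, and a low temperature approaches the greedy policy. One possible way to add an energy-saving criterion would be to subtract a stamina cost from each value before calling these functions.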

Actions:
An agent is allowed to choose the low-level actions dash(x) and turn(y), where the domains of both commands' parameters (x from [−100, 100], y from [−180°, 180°]) are discretized such that in total 76 actions are available to the agent at each time step.
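
The exact discretization is not given here, so the following Python sketch shows just one possible split that yields 76 actions: 41 dash powers in steps of 5 and 35 turn angles in steps of 10 degrees. It assumes dash takes a power in [-100, 100] and turn an angle in [-180°, 180°]; the step sizes are chosen only to make the total come out right.

```python
def build_action_set():
    """Return a list of (command, parameter) pairs; the split is hypothetical."""
    dashes = [("dash", float(p)) for p in range(-100, 101, 5)]   # 41 powers
    turns = [("turn", float(a)) for a in range(-170, 171, 10)]   # 35 angles
    return dashes + turns
```

The network's value estimates can then be evaluated once per entry of this list at each time step to pick the next command.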

Although the effectiveness of the policy is influenced by the presence of other players on the field, and the attacker may behave differently, a good formation of the other defenders makes passing between opponent players riskier, so this policy gains importance.

Future Work:
  • Enable a defender to shout for help if his stamina drops to a critical level;
  • When the team has a comfortable winning margin, have the defenders aim for the Time Out state and save energy by merely preventing the attacker from dribbling ahead;
  • When the attacking team has the ball in its own defensive area and there is a gap in midfield, have our players hassle them inside the opponent defensive area to conquer the ball and gain a good scoring chance;
  • Train a defender to hassle when one additional attacker and one additional defender are present on the field, so that the hassling player can block passes at the source.
Reference:
http://www.springerlink.com/content/p15566725w553751/