6.8 Creating Your Own Optimisation Task

To implement your own task, first look at how the existing tasks are written and adapt one of them to your needs. As long as your task follows the same structure as the examples, it should work. Add your task to the optimizationbehaviors.cc and optimizationbehaviors.h files, which contain the definitions of all the tasks we have created.

Below is an example of an optimisation task that you can use to optimise the kick parameters.

class OptimizationBehaviorKickScore: public NaoBehavior {
    const string outputFile;

    double timeStart;
    bool hasScored;
    bool beamChecked;
    bool backwards;
    bool ranIntoBall;
    bool fallen;

    int kick;

    double INIT_WAIT_TIME;
    int numscores_previously = 0;

    VecPosition ballInitPos;

    bool checkbeam();
    void initKick();
    void writeFitnessToOutputFile(double fitness);

public:

    OptimizationBehaviorKickScore(const std::string teamName, int uNum, const map<
                                  string, string>& namedParams_, const string& rsg_, const string& outputFile_);

    virtual void beam(double& beamX, double& beamY, double& beamAngle);
    virtual SkillType selectSkill();
    virtual void updateFitness();

};

The above code is found in the optimizationbehaviors.h header file. As with the Optimisers discussed earlier, all you need to do to create your own task is keep to the same structure. Specifically, your task needs to have the following functions:

Constructor - You can simply copy this one and change the name.
beam - Places the agent at its starting location.
selectSkill() - Defines the behaviour of the agent during the task.
updateFitness() - Can be thought of as a reward function; it keeps track of and updates the current agent's performance (fitness).
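To make the required shape concrete, here is a minimal sketch of a new task that overrides the same three functions. It is not taken from the code base: BehaviorStub and the one-value SkillType enum are stand-ins for NaoBehavior and the real skill enum so that the sketch compiles on its own, and OptimizationBehaviorStandStill and its fitness() getter are hypothetical names. A real task would derive from NaoBehavior and use the full constructor signature shown above.

```cpp
#include <cassert>
#include <string>

// Stand-in for the real skill enum; only here so the sketch compiles alone.
enum SkillType { SKILL_STAND };

// Stand-in for NaoBehavior, reduced to the three virtual functions a task overrides.
class BehaviorStub {
public:
    virtual ~BehaviorStub() {}
    virtual void beam(double& beamX, double& beamY, double& beamAngle) = 0;
    virtual SkillType selectSkill() = 0;
    virtual void updateFitness() = 0;
};

// Hypothetical task: the agent just stands, and each update adds 1 to the fitness.
class OptimizationBehaviorStandStill : public BehaviorStub {
    const std::string outputFile;
    double totalFitness;

public:
    explicit OptimizationBehaviorStandStill(const std::string& outputFile_)
        : outputFile(outputFile_), totalFitness(0.0) {
        assert(!outputFile.empty());  // a task needs somewhere to report its fitness
    }

    // Start position and angle of the agent.
    void beam(double& beamX, double& beamY, double& beamAngle) override {
        beamX = -2;
        beamY = 0;
        beamAngle = 0;
    }

    // Behaviour during the task.
    SkillType selectSkill() override { return SKILL_STAND; }

    // Reward bookkeeping: always accumulate, never overwrite.
    void updateFitness() override { totalFitness += 1.0; }

    // Hypothetical getter, added only so the sketch can be inspected.
    double fitness() const { return totalFitness; }
};
```

Swapping the bodies of selectSkill() and updateFitness() is usually all that distinguishes one task from another; the constructor and beam() tend to stay close to the KickScore versions.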

The implementations of these can be found in the corresponding optimizationbehaviors.cc file. Let's look at them individually.

Constructor

Below is the code for the KickScore task's constructor:

OptimizationBehaviorKickScore::OptimizationBehaviorKickScore(const std::string teamName, int uNum, const map<string, string>& namedParams_, const string& rsg_, const string& outputFile_) :
    NaoBehavior(teamName, uNum, namedParams_, rsg_), outputFile(outputFile_),
    kick(0), INIT_WAIT_TIME(3.0)
{
    numscores_previously = 0;
    initKick();
}

The main job of this constructor is to call initKick(), which does the following:

    hasScored = false;
    beamChecked = false;
    backwards = false;
    ranIntoBall = false;
    timeStart = worldModel->getTime();
    initialized = false;
    initBeamed = false;
    fallen = false;
    resetSkills();

    // Beam agent and ball
    double beamX, beamY, beamAngle;
    beam(beamX, beamY, beamAngle);
    VecPosition beamPos = VecPosition(beamX, beamY, 0);
    string msg = "(playMode BeforeKickOff)";
    setMonMessage(msg);

As you can see, it initialises a single run of the task by resetting variables such as timeStart. It also calls the beam function, which places our agent at the defined start location.

beam

As mentioned, beam plays the same role here as it does in the non-optimisation code: it places the agent at a given start position and angle. In this case that is 2 units to the left of the centre of the field, as we want our agent to score from around the halfway line.

void OptimizationBehaviorKickScore::beam(double& beamX, double& beamY,double& beamAngle) {
    beamX = -2;
    beamY = 0;
    beamAngle = 0;
}

selectSkill()

Since the goal of the KickScore task is to score a goal, our behaviour needs to reflect this, so we define selectSkill() as shown below. First, we call initKick() if timeStart is less than 0. We then wait for a short delay, defined by INIT_WAIT_TIME, to allow for some variance; during this delay the agent simply returns SKILL_STAND. Next we check whether we have scored, as we only want to score once per run; the hasScored variable keeps track of this. If we haven't scored, we return the kickBall skill with the centre of the opponent's goal as the kick target.

SkillType OptimizationBehaviorKickScore::selectSkill() {
    double time = worldModel->getTime();
    // std::cout << "Hello in selectskill." << "kick = " << kick << " scoreleft " << worldModel->getScoreLeft() << " hasscored = " <<  hasScored <<   " time = " << timeStart << "  "  << time << " wait " << INIT_WAIT_TIME<< "\n";
    if (timeStart < 0) {
        initKick();
        return SKILL_STAND;
    }

    // Wait a bit before attempting kick
    if (time-timeStart <= INIT_WAIT_TIME) {
        return SKILL_STAND;
    }

    // Check to see if we have scored
    if(worldModel->getScoreLeft() > numscores_previously){
        hasScored = true;
        numscores_previously++;
    }

    if (!hasScored) {
        return kickBall(KICK_IK, VecPosition(HALF_FIELD_X, 0, 0)); // The kick skill that we're optimizing
    }

    return SKILL_STAND;
}
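The branch order in selectSkill() can also be viewed as a small state machine that is independent of worldModel. The sketch below restates it as a pure function for illustration only: Action, TaskState and chooseAction are hypothetical names that do not exist in the code base, and ACTION_INIT stands for "call initKick() and then stand".

```cpp
#include <cassert>

// Hypothetical stand-in for the skills returned by the real selectSkill().
enum Action { ACTION_INIT, ACTION_STAND, ACTION_KICK };

// Hypothetical bundle of the member variables that selectSkill() reads.
struct TaskState {
    double timeStart;      // < 0 means the run has not been initialised yet
    double initWaitTime;   // delay before the first kick attempt
    bool hasScored;
    int scoresPreviously;  // goals counted in earlier runs
};

Action chooseAction(TaskState& s, double now, int scoreLeft) {
    assert(s.initWaitTime >= 0.0);

    // 1. Not initialised: reinitialise the run.
    if (s.timeStart < 0) {
        return ACTION_INIT;
    }
    // 2. Still inside the initial waiting period.
    if (now - s.timeStart <= s.initWaitTime) {
        return ACTION_STAND;
    }
    // 3. A new goal on the scoreboard means this run has scored.
    if (scoreLeft > s.scoresPreviously) {
        s.hasScored = true;
        s.scoresPreviously++;
    }
    // 4. Kick until a goal is scored, then stand.
    return s.hasScored ? ACTION_STAND : ACTION_KICK;
}
```

Comparing the scoreboard against a running count, rather than against zero, is what lets the same agent detect a fresh goal in each of the repeated runs.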

updateFitness()

The last function you need in order to create a task is updateFitness(). There is a lot going on here, and some of it may be unnecessary for your own task. The main thing to understand is that you need to define which situations should update the fitness of the agent, which in this case is stored in totalFitness.

Before we get into the functionality, it is important to reiterate that we run multiple complete executions of the task before we return the fitness value to the Optimiser. This is to alleviate some of the variation that can exist between executions of a task. In this case, kick keeps track of how many of these separate executions have occurred, up to a limit of 10. We then divide totalFitness by kick to get the average.

Because of this, whenever you update the fitness you need to add to it or subtract from it. If you simply write totalFitness = newvalue, you lose the fitness information from previous runs. This is also why the initKick function doesn't set totalFitness = 0.0. Cases where we update the fitness in this example include:

a beam failure (twice in a row) decreases the fitness by 100
the ball is kicked backwards towards our own goal, which decreases the fitness by 100
the ball is kicked in the correct direction but no goal is scored in the time allowed, which increases the fitness by the distance the ball travelled (from 0 up to 15); note that in the full listing of updateFitness() this timeout branch starts from a base value of -100 in every case, so the amount actually added is -100 plus the distance
a goal is scored, which increases the fitness by the inverse of the time taken (the quicker the goal, the higher the fitness)

We allow a maximum time for a single run of the task with the check if (time - (timeStart + INIT_WAIT_TIME) > 60 ), which gives the agent 60 seconds.

The reward for scoring is calculated as follows:

double timetaken = time - (timeStart + INIT_WAIT_TIME);
totalFitness += 60.0/timetaken;

Note that in each case where a run concludes, we increment the kick variable by 1 and call initKick to reinitialise the task.

void OptimizationBehaviorKickScore::updateFitness() {
    static double totalFitness = 0.0;
    if (kick == 10) {
        writeFitnessToOutputFile(totalFitness/(double(kick)));
        return;
    }

    double time = worldModel->getTime();
    VecPosition meTruth = worldModel->getMyPositionGroundTruth();
    meTruth.setZ(0);

    if (time-timeStart <= INIT_WAIT_TIME) {
        return;
    }

    if (!beamChecked) {
        static bool failedLastBeamCheck = false;
        if (!checkbeam()) {
            // Beam failed so reinitialize everything
            if (failedLastBeamCheck) {
                // Probably something bad happened if we failed the beam twice in
                // a row (perhaps the agent can't stand) so give a bad score and
                // move on
                totalFitness += -100;
                kick++;
            }
            failedLastBeamCheck = true;
            initKick();
            return;
        } else {
            failedLastBeamCheck = false;
            string msg = "(playMode KickOff_Left)";
            setMonMessage(msg);
        }
    }

    VecPosition ballTruth = worldModel->getBallGroundTruth();
    if (ballTruth.getX() < -.25) {
        backwards = true;
    }

    if (worldModel->isFallen()) {
        fallen = true;
    }

    if (time - (timeStart + INIT_WAIT_TIME) > 60 ) {
        cout << "Stopping now due to time" << endl;
        double angleOffset = abs(VecPosition(0, 0, 0).getAngleBetweenPoints(VecPosition(20, 0, 0), ballTruth));
        double distance = ballTruth.getX();
        double fitness = -100;

        if (backwards || distance <= 0.1) {
            fitness = -100;
            if (backwards) {
                cout << "Detected backward kick" << endl;
            } else {
                cout << "Detected insufficient distance" << "Time Now: " << time << " Time start + wait" << (timeStart + INIT_WAIT_TIME) << endl;
            }
        }
        cout << "Traveled distance = " << distance << endl;
        cout << "Fitness = " << fitness << endl;
        cout << "Final position = " << ballTruth.getX() << ", " << ballTruth.getY() << endl;

        totalFitness += fitness + distance;
        kick++;
        initKick();
        return;
    }

    if(hasScored){
        double timetaken = time - (timeStart + INIT_WAIT_TIME);
        totalFitness += 60.0/timetaken;
        kick++;
        initKick();
        return;
    }

}
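The accumulate-then-average pattern used by updateFitness() can be isolated from the simulator code. The following is a sketch under assumptions, not part of the code base: FitnessAccumulator and its member names are hypothetical, and the three update rules simply mirror the arithmetic in the listing above (60.0/timetaken for a goal, -100 plus distance on timeout, -100 for a repeated beam failure).

```cpp
#include <cassert>

// Hypothetical helper that mirrors the bookkeeping done with totalFitness and kick.
struct FitnessAccumulator {
    double total;  // plays the role of totalFitness
    int runs;      // plays the role of kick

    FitnessAccumulator() : total(0.0), runs(0) {}

    // Goal scored: the reward grows as the time taken shrinks.
    void addScoredRun(double timeTaken) {
        assert(timeTaken > 0.0);
        total += 60.0 / timeTaken;
        ++runs;
    }

    // Run timed out: base penalty plus the distance travelled along x.
    void addTimedOutRun(double distance) {
        total += -100.0 + distance;
        ++runs;
    }

    // Beam failed twice in a row: flat penalty.
    void addFailedRun() {
        total += -100.0;
        ++runs;
    }

    // Value reported to the optimiser once all runs are done.
    double average() const {
        assert(runs > 0);
        return total / runs;
    }
};
```

For example, goals scored in 30 and 60 seconds contribute 2.0 and 1.0, giving an average of 1.5 over two runs. Every update uses += so that earlier runs are never overwritten, which is the point made about totalFitness above.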

Before you can run an optimisation on this new task, you need to add it to main.cc. The following is an example of what to add:

else if (agentType == "walkForwardAgent") {
        cerr << "creating OptimizationBehaviorWalkForward" << endl;
        behavior = new OptimizationBehaviorWalkForward(  teamName,
                uNum,
                namedParams,
                rsg,
                outputFile);
    }

You simply need to copy this block, give your task a unique name in the agentType comparison, and construct your own behaviour class in place of OptimizationBehaviorWalkForward.
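As an illustration of how such a chain of agentType comparisons selects a task, here is a self-contained sketch. It does not reproduce main.cc: Behavior, WalkForwardTask, MyTask, makeBehavior and the "myTaskAgent" string are all hypothetical, and std::unique_ptr is used here only to keep the sketch free of leaks, whereas the snippet above uses a raw new.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <string>

// Hypothetical minimal base class standing in for the real behaviour type.
struct Behavior {
    virtual ~Behavior() {}
    virtual std::string name() const = 0;
};

struct WalkForwardTask : Behavior {
    std::string name() const override { return "walkForward"; }
};

// Hypothetical new task added by the reader.
struct MyTask : Behavior {
    std::string name() const override { return "myTask"; }
};

// Each task is selected by a unique agentType string.
std::unique_ptr<Behavior> makeBehavior(const std::string& agentType) {
    assert(!agentType.empty());  // an empty type suggests an argument-parsing bug

    if (agentType == "walkForwardAgent") {
        std::cerr << "creating WalkForwardTask" << std::endl;
        return std::unique_ptr<Behavior>(new WalkForwardTask());
    } else if (agentType == "myTaskAgent") {
        std::cerr << "creating MyTask" << std::endl;
        return std::unique_ptr<Behavior>(new MyTask());
    }
    return nullptr;  // unknown agent type
}
```

If two branches share the same agentType string, only the first is ever reached, which is why the name of each task must be unique.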