13

I have two functions, scheduleScan() and scan().

scan() calls scheduleScan() when there's nothing else to do except scheduling a new scan, so scheduleScan() can schedule a scan(). But there's a problem, some jobs run twice.

I want to make sure that only one job is being processed at any given time. How can I achieve that? I believe it has something to do with done(), (it was in scan(), removed now) but I couldn't come up with a solution.

Bull version: 3.12.1

Important late edit: scan() calls another functions and they may or may not call other functions, but they're all sync functions, so they only call a function when their own jobs are completed, there is only one way forward. At the end of the "tree", I call it, the last function calls scheduleScan(), but there can't be two simultaneous jobs running. Every single job starts at scan(), by the way, and they end with scheduleScan(stock, period, milliseconds, 'called by file.js')

export function update(job) {
  // does some calculations, then it may call scheduleScan() or
  // it may call another function, and that could be the one calling
  // scheduleScan() function.
  // For instance, a function like finalize()
}

export function scan(job) {
  update(job)
}


import moment from 'moment'
import stringHash from 'string-hash'
const opts = { redis: { port: 6379, host: '127.0.0.1', password: mypassword' } }
let queue = new Queue('scan', opts)

queue.process(1, (job) => {
  job.progress(100).then(() => {
    scan(job)
  })
})

export function scheduleScan (stock, period, milliseconds, triggeredBy) {
  let uniqueId = stringHash(stock + ':' + period)

  queue.getJob(uniqueId).then(job => {
    if (!job) {
      if (milliseconds) {
        queue.add({ stock, period, triggeredBy }, { delay: milliseconds, jobId: uniqueId }).then(() => {
          // console.log('Added with ms: ' + stock + ' ' + period)
        }).catch(err => {
          if (err) {
            console.log('Can not add because it exists ' + new Date())
          }
        })
      } else {
        queue.add({ stock, period, triggeredBy }, { jobId: uniqueId }).then(() => {
          // console.log('Added without ms: ' + stock + ' ' + period)
        }).catch(err => {
          if (err) {
            console.log('Can not add because it exists ' + new Date())
          }
        })
      }
    } else {
      job.getState().then(state => {
        if (state === 'completed') {
          job.remove().then(() => {
            if (milliseconds) {
              queue.add({ stock, period, triggeredBy }, { delay: milliseconds, jobId: uniqueId }).then(() => {
                // console.log('Added with ms: ' + stock + ' ' + period)
              }).catch(err => {
                if (err) {
                  console.log('Can not add because it exists ' + new Date())
                }
              })
            } else {
              queue.add({ stock, period, triggeredBy }, { jobId: uniqueId }).then(() => {
                // console.log('Added without ms: ' + stock + ' ' + period)
              }).catch(err => {
                if (err) {
                  console.log('Can not add because it exists ' + new Date())
                }
              })
            }
          }).catch(err => {
            if (err) {
              // console.log(err)
            }
          })
        }
      }).catch(err => {
        // console.log(err)
      })
    }
  })
}
salep
  • 1,332
  • 9
  • 44
  • 93

2 Answers2

7

The problem, I believe is your scan function is async. So your job.progress function calls scan and then immediately calls done allowing the queue to process another job.

A solution could be to pass the done callback as a parameter to your scan and scheduleScan functions, and invoke it, once you have completed your job (or on error).

Another (better) solution could be to ensure that you always return a Promise from scan and scheduleScan, then await the promise to resolve and then call done. If doing this, make sure you chain all your promise returns in your scheduleScan function.

queue.process(1, (job, done) => {
  job.progress(100).then(() => {
    scan(job)
        .then(done)
        .catch(done)
  })
})

export function scan() {
   // business logic
   return scheduleScan()
}

// Chain all of your promise returns. Otherwise
// the scan function will return sooner and allow done to be called
// prior to the scheduleScan function finishing it's execution
export function scheduleScan() {
    return queue.getJob(..).then(() => {
        ....
        return queue.add()...
        ....
        return queue.add(...)
            .catch(e => {
                 console.log(e);
                 // propogate errors!
                 throw e;
             })

}
jeeves
  • 1,871
  • 9
  • 25
  • I've edited my question, can you please check it out again, especially "Important late edit" part? Does your answer still apply in this situation? Thanks. – salep Feb 09 '20 at 12:08
  • 1
    Yes, it is still valid. From your edit, I think you are saying `scheduledScan` is always called after all other sync functions in `scan`. If this is the case, then yes, my answer is still valid. Just always return the promise that will be returned from `scheduleScan` in the `scan` function – jeeves Feb 09 '20 at 12:33
  • Again, my mistake. The first function, update(), is in scan, but update() may call another function like finalize(), and finalize() may call scheduleScan(). Please keep in mind that these happen in an order, so there are no multiple calls, I'm doing this to keep my app modular. -- Thanks – salep Feb 09 '20 at 12:50
  • 1
    Yep, same answer. If `update` calls `scheduledScan` or any number of functions between them. The key point is that you need to return the promise chain from `scheduleScan` all the way back to the `scan` function. So if `scan` calls `update` which calls `finalise` ..... Which calls `scheduleScan` the promise chain will need to be returned through all the function invocations, i.e. Just make sure you return the promise from each of these functions. – jeeves Feb 09 '20 at 13:04
  • So just to clarify my last comment. For example, if inside scan you call update. You need to return the result of update (a promise) from the scan function. – jeeves Feb 09 '20 at 13:16
4

The scan function is an asynchronous function. In you queue.process() function you have to await the scan function and then call the done() callback.

export async function scan(job) {
  // it does some calculations, then it creates a new schedule.
  return scheduleScan(stock, period, milliseconds, "scan.js");
}

queue.process(1, (job, done) => {
  job.progress(100).then(async() => {
    await scan(job);
    done();
  });
});

export async function scheduleScan(stock, period, milliseconds, triggeredBy) {
    let uniqueId = stringHash(stock + ":" + period);
    try {
      const existingJob = await queue.getJob(uniqueId);
      if (!existingJob) {
        const job = await addJob({
          queue,
          stock,
          period,
          uniqueId,
          milliseconds,
          triggeredBy
        });
        return job;
      } else {
        const jobState = await existingJob.getState();
        if (jobState === "completed") {
          await existingJob.remove();
          const newJob = await addJob({
            queue,
            stock,
            period,
            uniqueId,
            milliseconds,
            triggeredBy
          });
          return newJob;
        }
      }
    } catch (err) {
      throw new Error(err);
    }
}

export function addJob({ queue, stock, period, milliseconds, triggeredBy }) {
  if (milliseconds) {
    return queue.add(
      { stock, period, triggeredBy },
      { delay: milliseconds, jobId: uniqueId }
    );
  } else {
    return queue.add({ stock, period, triggeredBy }, { jobId: uniqueId });
  }
}

Try this! I've tried to refactor the code a bit by using async-await.

Adithya Sreyaj
  • 1,710
  • 10
  • 22
  • I've edited my question, can you please check it out again, especially "Important late edit" part? Does your answer still apply in this situation? Thanks. – salep Feb 09 '20 at 12:08