retry raw1394_read/write/lock/lock64 with delays after ack-busy

Applications and higher-level libraries have retry strategies of their
own, but these are not always effective.  For example, old Panasonic
camcorders require pauses on the order of several milliseconds between
the response to one transaction and the request of the next one, but
libavc1394 and libiec61883 already retry after 20 microseconds.

This change cures all FCP transaction failures ("send oops") in kino and
dvgrab that I was getting with Panasonic NV-DX110.  According to reports,
Panasonic AG-EZ30 and Grundig Scenos DLC 2000 were affected too.

The additional latency in raw1394_read/write/lock/lock64 appears to be
the better alternative to terminal I/O failures.  Besides, a caller of
this blocking request API should, at least in theory, be prepared to
cope with transaction durations on the order of a few seconds:
IEEE 1394 specifies split transaction timeouts of up to 8 seconds.  An
application which needs more control should use the non-blocking request
API, i.e. raw1394_start_read/write/lock/lock64.
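
To put a number on the added latency: the retry loop in this patch makes
at most 10 attempts and requests a 10 ms sleep between attempts, so it
sleeps at most 9 times.  The helper below is only an illustrative sketch
(worst_case_sleep_ms is a hypothetical name, not part of libraw1394); it
computes the requested sleep time alone and ignores the duration of the
transactions themselves and any oversleep by nanosleep().

```c
/*
 * Hypothetical helper, for illustration only: upper bound on the sleep
 * time requested by a loop that makes at most max_attempts attempts and
 * sleeps delay_ms between consecutive attempts.
 */
static long
worst_case_sleep_ms(int max_attempts, long delay_ms)
{
	/* No sleep follows the last attempt, hence max_attempts - 1. */
	if (max_attempts <= 1)
		return 0;

	return (long)(max_attempts - 1) * delay_ms;
}
```

With the constants used in this patch, worst_case_sleep_ms(10, 10)
evaluates to 90, i.e. about 90 ms of added sleep if the responder stays
busy through all attempts.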

We specifically retry only after ack-busy, not after any of the other
types of transaction failures, which may or may not succeed if retried.
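
The policy can be sketched in isolation as follows.  This is a
simplified stand-alone illustration, not the libraw1394 code: the
tx_result codes, retry_on_busy and the mock responder are all
hypothetical names made up for this sketch.  The loop retries only while
the responder reports busy and hands any other outcome straight back to
the caller.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L	/* for nanosleep() */
#endif
#include <stddef.h>
#include <time.h>

/* Hypothetical result codes for this sketch; not libraw1394 definitions. */
enum tx_result { TX_OK = 0, TX_BUSY = 1, TX_OTHER_ERROR = 2 };

typedef enum tx_result (*tx_fn)(void *ctx);

/*
 * Call send() until it returns something other than TX_BUSY or until
 * max_attempts attempts were made.  Sleep delay_ns (must be less than
 * one second) between attempts.  The number of attempts is stored in
 * *attempts.
 */
static enum tx_result
retry_on_busy(tx_fn send, void *ctx, int max_attempts, long delay_ns,
	      int *attempts)
{
	struct timespec delay = { .tv_sec = 0, .tv_nsec = delay_ns };
	enum tx_result res;
	int i;

	for (i = 1; ; i++) {
		res = send(ctx);
		/* Success and non-busy failures are both final. */
		if (res != TX_BUSY || i == max_attempts)
			break;
		nanosleep(&delay, NULL);
	}
	*attempts = i;

	return res;
}

/* Mock responder: busy for the first busy_left calls, then final. */
struct mock {
	int busy_left;
	enum tx_result final;
};

static enum tx_result
mock_send(void *ctx)
{
	struct mock *m = ctx;

	if (m->busy_left > 0) {
		m->busy_left--;
		return TX_BUSY;
	}
	return m->final;
}
```

A mock that is busy three times and then succeeds is completed on the
fourth attempt, while one that fails with TX_OTHER_ERROR right away is
not retried at all.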

This change is made only in the firewire-core backend (a.k.a. juju).
The same could be added to the raw1394 backend (a.k.a. linux1394) but is
not as important there, perhaps because transaction completion latency
in the ieee1394 core greatly increases the success rate of the existing
retry code in libavc1394 and friends.

Note that this does not fix each and every FCP transaction problem.
There are, e.g., certain JVC camcorders which do not properly complete
FCP transactions if an application frequently polls for status or
requests status right before a control request, even with delays an
order of magnitude greater than those used in this patch.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Reviewed-by: Dan Dennedy <dan@dennedy.org>
Stefan Richter 2011-08-13 18:19:12 +02:00
parent db5f202d5d
commit 8e433bf584
1 changed file with 29 additions and 1 deletion


@@ -17,6 +17,7 @@
 #include <sys/ioctl.h>
 #include <sys/epoll.h>
 #include <sys/inotify.h>
+#include <time.h>
 #include <unistd.h>
 #include <fcntl.h>
 #include <errno.h>
@@ -1281,7 +1282,7 @@ sync_callback(raw1394handle_t handle, void *data, raw1394_errcode_t err)
 }
 static int
-send_request_sync(raw1394handle_t handle, int tcode,
+_send_request_sync(raw1394handle_t handle, int tcode,
 		  nodeid_t node, nodeaddr_t addr,
 		  size_t in_length, void *in, size_t out_length, void *out)
 {
@@ -1309,6 +1310,33 @@ send_request_sync(raw1394handle_t handle, int tcode,
 	return (errno ? -1 : 0);
 }
+static int
+send_request_sync(raw1394handle_t handle, int tcode,
+		  nodeid_t node, nodeaddr_t addr,
+		  size_t in_length, void *in, size_t out_length, void *out)
+{
+	static const struct timespec delay = {
+		.tv_sec = 0,
+		.tv_nsec = 10 * 1000 * 1000,
+	};
+	fw_handle_t fwhandle = handle->mode.fw;
+	int i, ret;
+	/*
+	 * Retry after ack-busy, but give the responder some breathing room.
+	 * 10 ms delay between retries is proper for ancient camcorders.
+	 */
+	for (i = 1; ; i++) {
+		ret = _send_request_sync(handle, tcode, node, addr,
+				in_length, in, out_length, out);
+		if (ret == 0 || i == 10 || fwhandle->err != -RCODE_BUSY)
+			break;
+		nanosleep(&delay, NULL);
+	}
+	return ret;
+}
 int
 read_config_rom(fw_handle_t handle, nodeid_t node, nodeaddr_t addr,
 		size_t length, quadlet_t *buffer)