Forum Index : Microcontroller and PC projects : weird loss of program issue

twofingers

Guru

Joined: 02/06/2014
Location: Germany
Posts: 1593
Posted: 07:03am 05 Sep 2015

  TZAdvantage said   Something you can not control. If the person connecting the power removes it at that exact time you have the failure. It can be anything 10ms, 100ms, 1000ms, etc.
As such there is no good delay time. A reason why even a supervisory chip will not take that risk away.


Maybe you misinterpreted me? But this was my fault! I did not make it clear.

My suggestion was made to verify the error.
If our assumptions are correct, then the error should no longer occur (or be considerably reduced) if we have a delayed start.
I can't imagine a bounce of 500ms or more in a well-designed circuit.
Edited by twofingers 2015-09-06
causality ≠ correlation ≠ coincidence
 
MicroBlocks

Guru

Joined: 12/05/2012
Location: Thailand
Posts: 2209
Posted: 07:22am 05 Sep 2015

It can be called a power failure (or just someone pulling the plug), and they can occur at all times.
If the error is at startup then no matter what delay there is it will not solve the problem, it only moves it to a later moment.
It only solves the moment of connecting a power supply. That can be very 'dirty' with lots of spikes and jumping between zero and the required voltage.
However if you remove the power supply, that is something you can not control.
If that happens during a flash write the problem occurs.
With a delayed start the chances of a failure are smaller, but not gone.
For some, 1 failure out of 100 is acceptable; for me it would not be.
1 out of 100 means a failure within 3-4 months when it is switched on/off once a day.
In a car it might be within a week.
I would expect no failures, because this can be prevented in software.
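
To put rough numbers on that estimate, here is a back-of-envelope sketch in MMBasic. It assumes every power cycle carries the same 1-in-100 chance of corruption; the variable name and the 14 power cycles a day for the car case are illustrative assumptions, not measurements.

```basic
' Rough estimate only: assumes each power cycle has the same
' 1-in-100 chance of landing on a flash write (an assumption).
cyclesPerFailure = 100
PRINT "Once a day: about"; cyclesPerFailure / 1; " days"
' 14 cycles a day is a guess at car use, not a measured figure
PRINT "In a car: about"; cyclesPerFailure / 14; " days"
```

One cycle a day gives roughly 100 days, which is where the 3-4 months comes from.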


Edited by TZAdvantage 2015-09-06
Microblocks. Build with logic.
 
twofingers

Guru

Joined: 02/06/2014
Location: Germany
Posts: 1593
Posted: 07:35am 05 Sep 2015

  TZAdvantage said   If that happens during a flash write the problem occurs.


To make it clear:
I think this means absolutely no flash writes - no OPTIONS in MMBasic? Since a power failure can occur at any time.
causality ≠ correlation ≠ coincidence
 
MicroBlocks

Guru

Joined: 12/05/2012
Location: Thailand
Posts: 2209
Posted: 08:02am 05 Sep 2015

Yes.

It is also why some OPTIONS can (should) only be set at the prompt and not from code.

If you have a product that must be highly dependable and is not in easy reach of someone who can fix it, then yes, no flash writes should be done.
If really needed, use an external flash memory chip.

As the uMite is moving more and more into real products this becomes important.
It is not only on Geoff's plate to solve this. A good understanding of what can happen (and Murphy says it will!) can prevent most if not all of the problems.
Writing to flash can be very useful, just keep in mind that it is a risk.
That risk is acceptable for many and not for some. Everyone has to decide, for their own project or product, whether the risk is worth it.

My strategy would be to set all options first then load the program. Then read the flash content to a hex file and copy that hex to all other chips. No options will ever have to be set by the program after that.

Microblocks. Build with logic.
 
twofingers

Guru

Joined: 02/06/2014
Location: Germany
Posts: 1593
Posted: 08:31am 05 Sep 2015

  TZAdvantage said   As the uMite is moving more and more into real products this becomes important.
It is not only on Geoff's plate to solve this. A good understanding of what can happen (and Murphy says it will!) can prevent most if not all of the problems.
Writing to flash can be very useful, just keep in mind that it is a risk.
That risk is acceptable for many and not for some. Everyone has to decide, for their own project or product, whether the risk is worth it.

My strategy would be to set all options first then load the program. Then read the flash content to a hex file and copy that hex to all other chips. No options will ever have to be set by the program after that.


Thanks!
Maybe this should find its way into the manual?!
If our conclusions are correct.
causality ≠ correlation ≠ coincidence
 
Geoffg

Guru

Joined: 06/06/2011
Location: Australia
Posts: 3292
Posted: 01:49pm 05 Sep 2015

Sorry to "bust everyone's bubble" but the firmware does check the current option setting and does not write to the flash if it is the same.

The reason why WW is consistently corrupting the flash may be that he is re-flashing the firmware after each test (because the previous test corrupted the flash!) and then MMBasic must write to the flash to set the option.

A way of fixing this issue might be to delay startup (with PAUSE 500) to give the power supply time to settle but, as TZ pointed out, any power failure while writing to flash could be fatal and that could happen at any time (even 500ms later).

I cannot see what MMBasic can do to avoid corrupting the flash if the power fails during a write to flash. The solution will have to be hardware based - or avoid using any command that causes a write to flash.
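
A minimal sketch of the delayed start described above, applied to a short test program. The 500 ms figure is simply the value suggested here, not a tested threshold, and the delay only covers the moment power is connected; it does nothing for a power loss during a later flash write.

```basic
PAUSE 500             ' give the power supply time to settle first
OPTION AUTORUN ON     ' firmware only writes to flash if the stored setting differs
PRINT "Hello"
```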

Geoff
Geoff Graham - http://geoffg.net
 
TassyJim

Guru

Joined: 07/08/2011
Location: Australia
Posts: 6283
Posted: 02:17pm 05 Sep 2015

OPTION AUTORUN ON is remembered and only reset with a NEW command so there is no need to have it in the program.

It is convenient to have it in the program code (so you don't forget).

Peter's Cfunction allows you to do a conditional test and only set it if needed - on first run.
That could be taken a step further and set a flag in flash and group all the 'first run only' settings under the one conditional statement.

Put your VAR SAVE late in the program and the VAR RESTORE early.

It might be easier to explain with some code. I will try and post some later.

Jim
VK7JH
MMedit
 
viscomjim
Guru

Joined: 08/01/2014
Location: United States
Posts: 925
Posted: 02:37pm 05 Sep 2015

So I did a simple test and videoed it to show what is happening on my end. This is very controlled, as the ONLY difference between the two tests is that one is running 4.5E and the other is running 4.7b23. Here is the simple program the board is running in both tests...

OPTION AUTORUN ON

SetPin 4, DOUT 'Output 1
SetPin 5, DOUT 'Output 2
SetPin 6, DOUT 'Output 3
SetPin 26, DOUT 'Output 4
SetPin 14, DOUT 'Status Led


DO
  PIN(4)=1:PIN(5)=1:PIN(6)=1:PIN(26)=1:PIN(14)=1
  PAUSE 80
  PIN(4)=0:PIN(5)=0:PIN(6)=0:PIN(26)=0:PIN(14)=0
  PAUSE 80
LOOP

I am applying and removing power with the barrel connector. You can see that in THIS video, when the board is running 4.5E, I can't get it to fail. I can do this for a while and get absolutely no failure. In THIS video, the only difference is I loaded the board with 4.7b23. You can see that after about 4 connections, the program is gone. I can't help but think that there is something slightly different in 4.5E that is not affected by "dirty" power and startup. I wish I knew how to dig further, but this is about the extent of what I can test. It is VERY consistent. 4.5E seems rock solid.
 
TassyJim

Guru

Joined: 07/08/2011
Location: Australia
Posts: 6283
Posted: 02:46pm 05 Sep 2015

Jim,
Can you put this code at the start of your test code and see what happens.

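OPTION EXPLICIT
DIM firstrun
VAR RESTORE                ' reload any previously saved variables
' PRINT firstrun 'DEBUG
IF firstrun = 0 THEN       ' nothing saved yet, so this is the first run
  PRINT "Welcome to your first run"
  PAUSE 1000
  OPTION AUTORUN ON        ' first-run-only settings go here
  firstrun = 1
  VAR SAVE firstrun        ' remember that the first run is done
ELSE
  PRINT "Welcome Back"
ENDIF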

I tried to get it to fail on my 64pin MX470 but only managed to corrupt the attached 7inch TFT.

The TFT was OK after a longer power-down period so I could stop panicking.

If the above works, it is a "temporary fix" but I agree that there is something that needs to be tracked down.


VK7JH
MMedit
 
viscomjim
Guru

Joined: 08/01/2014
Location: United States
Posts: 925
Posted: 03:11pm 05 Sep 2015

TassyJim, I added your code to the beginning of my simple LED test using 4.7b23, which would normally be dead in about 4 or 5 tries. Now with your code, I can't kill it, very cool. However, another weirdness does crop up. Even though the LEDs flash like they are supposed to every time I power up the unit, after about 15 to 20 power-ups the power-up banner does not appear any more. I tried in MMEdit and Tera Term with the same results. I can do a Ctrl-C and the program stops, and I can type RUN and the program starts again. Just no more welcome banner or any TX from the uMite. Very weird. Still, I can't kill it with the dirty power, so I think you are on to something, just not sure what.
 
viscomjim
Guru

Joined: 08/01/2014
Location: United States
Posts: 925
Posted: 03:21pm 05 Sep 2015

Just reloaded 4.7b23 and tried again, same exact failure. The code runs properly, can't kill it with power, but after a few tries, the TX goes away. I can still Ctrl-C and RUN and all works great, just no TX from the uMite. The program keeps going though. WEIRD. This does not happen with 4.5E however.
 
Geoffg

Guru

Joined: 06/06/2011
Location: Australia
Posts: 3292
Posted: 03:30pm 05 Sep 2015

  viscomjim said  This does not happen with 4.5E however.

Just to be clear, you are using both versions of the firmware on the same chip. Correct?

  viscomjim said  Peter's Cfunction allows you to do a conditional test and only set it if needed - on first run.

That is not needed - MMBasic does a similar test in the firmware.
Edited by Geoffg 2015-09-07
Geoff Graham - http://geoffg.net
 
viscomjim
Guru

Joined: 08/01/2014
Location: United States
Posts: 925
Posted: 03:35pm 05 Sep 2015

Yes, both tests are using the same board with a 170 SOIC chip. Only change is using 4.5E or 4.7b23.

In my last post where I said, "this does not happen with 4.5E", that did not apply to adding Tassy Jim's code. There is no problem using 4.5E.

EDIT

I think you are confusing me with someone else. I have not tried Peter's C function at all.
Edited by viscomjim 2015-09-07
 
TassyJim

Guru

Joined: 07/08/2011
Location: Australia
Posts: 6283
Posted: 03:58pm 05 Sep 2015

  viscomjim said   Just reloaded 4.7b23 and tried again, same exact failure. The code runs properly, can't kill it with power, but after a few tries, the TX goes away. I can still Ctrl-C and RUN and all works great, just no TX from the uMite. The program keeps going though. WEIRD. This does not happen with 4.5E however.


Jim,
can you look at the TX line with a CRO to see if it is doing anything? That will tell us if the uM has stopped or if it's still Tx'ing and your USB-TTL converter is stuck. Perhaps a handshake line got hit.

Geoff,
I suggested the Cfunction but decided that any SAVE'd VAR will do as a flag.
The idea of having a 'first run' bit of code will come in handy sometime so it was not a wasted effort for me.

I haven't been able to get any of my MX170s or MX470s to fail at all, so I can only make guesses about the problems that some are having.

Jim

VK7JH
MMedit
 
WhiteWizzard
Guru

Joined: 05/04/2013
Location: United Kingdom
Posts: 2944
Posted: 04:16pm 05 Sep 2015

Found my PICkit 3 (it was under my keyboard)

Using a simple two line program:
OPTION AUTORUN ON
Print "Hello"


On a 28-pin MX170 DIP with a 3.7v LiPo powering MuP which has a 3v3 LDO with a 10uF Vcap. Genuine FTDI USB-to-Serial connected to MM console pins into TeraTerm.

When 4.6 or 4.7 is loaded I can 100% of the time wipe the program with 'bounced' power (between LiPo and LDO in)

With 4.5E - I just can't get it to wipe.

This is consistent with what I observed before (and also what viscomjim is seeing)

The same thing happens with a 28-pin MX170 SOIC - but have not tried a 44-pin yet

One other thing I noticed with 4.7 (although if FLASH gets corrupted I would expect strange things to happen) is that I once got an error message "Cannot use reserved pin." (and the editor switched into OPTION COLOURCODE ON even though I hadn't selected this!) BUT the two-line program was still intact.

WW

EDIT: To be clear, this is on the same chip and on the exact same hardware, with the exact same two-line program - the only difference is the firmware version flashed onto the PIC.
Edited by WhiteWizzard 2015-09-07
 
viscomjim
Guru

Joined: 08/01/2014
Location: United States
Posts: 925
Posted: 04:30pm 05 Sep 2015

EDIT: To be clear, this is on the same chip and on the exact same hardware, with the exact same two-line program - the only difference is the firmware version flashed onto the PIC.

That's what I meant to say also...
 
MicroBlocks

Guru

Joined: 12/05/2012
Location: Thailand
Posts: 2209
Posted: 08:20pm 05 Sep 2015

@WW, could you do the following.

1) Burn the chip with firmware
2) Start it
3) Enter your 2 line code
4) Run it
5) Connect the PICkit and read the flash. Save it as a hex file and call it step5.hex
6) Keep applying power until it fails
7) Load the step5.hex into IPE
8) Verify the flash with the loaded step5.hex

If nothing was written to the flash, as should be the case, there will be no difference and the verify should succeed.
If there was a write to flash (this should not happen), the verify will show exactly which addresses were written to.

What I suspect is that when your program starts and executes OPTION AUTORUN ON, it DOES write to flash because it determines that the value is not set in flash.
The reason, I think, is that when the power is not stable the routine that checks the flash reads wrong values, corrupted by the unstable power supply.

A test to minimize this 'effect' is to add a PAUSE 1000 as the first line of your code. A dirty power line connection would then be ignored and your program will only reach that line when power is on for at least a second.

But as Matherp showed there is some code in the firmware that checks the flash for default values and does a flash write when that area is not initialized.

To rule that routine in the firmware in or out, see whether you can also get a failure when the program contains only one line, such as PRINT "Hello"

Microblocks. Build with logic.
 
MicroBlocks

Guru

Joined: 12/05/2012
Location: Thailand
Posts: 2209
Posted: 08:28pm 05 Sep 2015

There is one thing I would like to add to the part of the discussion about using a supervisory chip or other forms of a delayed startup.

When using a PICkit 3 to program a chip 'in circuit', Microchip specifically points out that no capacitor should be connected to the MCLR pin.
A capacitor will stop the PICkit 3 from working, because the rapidly changing signal on that pin is smoothed out and the critical timing for programming is lost.
When you buy an original PICkit 3 it comes with a poster/manual that has this picture on it to show what should not be done.




Microblocks. Build with logic.
 
viscomjim
Guru

Joined: 08/01/2014
Location: United States
Posts: 925
Posted: 04:58am 06 Sep 2015

  TassyJim said  
Jim,
can you look at the TX line with a CRO to see if it is doing anything? That will tell us if the uM has stopped or if it's still Tx'ing and your USB-TTL converter is stuck. Perhaps a handshake line got hit.


I won't be able to do that till Monday as my scope is at work. I have a feeling it's my USB serial dealio acting out.

Can you tell me your theory as to why, with your code, 4.7 seems to work OK with the bad power setup (the barrel power connector)?

I received a call last night from my customer that another board "went dead". Thankfully he is local and right now I am just reloading the "bad" boards with 4.5E and my modified program. I am now carrying around a PICkit 3 with pogo pins attached and using the programmer-to-go feature to make the fix on site (works quite well actually). So far, no failures reported after reloading the new setup.

I will set up an experiment with the TassyJim fix code and 4.7 and "beat it up" for a while to see if this will last. So far so good.
 
WhiteWizzard
Guru

Joined: 05/04/2013
Location: United Kingdom
Posts: 2944
Posted: 08:08am 07 Sep 2015

@viscomjim

Did you try this?

  WhiteWizzard said  JIM: Try your code on v4.7 but comment out OPTION AUTORUN ON (I bet it will work!). If it does fail, then are you using any other commands that 'talk' to FLASH? If so then comment these out too . . . .
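
For reference, a sketch of that test applied to the LED program posted earlier in the thread. The only change is that OPTION AUTORUN ON is commented out. It assumes the option has already been set once at the command prompt, where it is remembered until a NEW command.

```basic
' OPTION AUTORUN ON   ' commented out for this test; assumed already set at the prompt

SETPIN 4, DOUT    ' Output 1
SETPIN 5, DOUT    ' Output 2
SETPIN 6, DOUT    ' Output 3
SETPIN 26, DOUT   ' Output 4
SETPIN 14, DOUT   ' Status LED

DO
  PIN(4) = 1 : PIN(5) = 1 : PIN(6) = 1 : PIN(26) = 1 : PIN(14) = 1
  PAUSE 80
  PIN(4) = 0 : PIN(5) = 0 : PIN(6) = 0 : PIN(26) = 0 : PIN(14) = 0
  PAUSE 80
LOOP
```

If this version survives the bounced power on 4.7 while the original does not, that points at the OPTION line in the program rather than at the rest of the code.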


WW
 